Gentoo Archives: gentoo-performance

From: Bart Alewijnse <scarfboy@×××××.com>
To: gentoo-performance@l.g.o
Subject: Re: [gentoo-performance] Re: portage performance
Date: Sun, 18 Jul 2004 10:29:17
In Reply to: [gentoo-performance] Re: portage performance by Jesse Guardiani

> > The BIG hitch in portage is the database's file system
> > based. Basicly it's thousands of small text files... you want to update
> > the database?.... open, read, close over and over again....
> >
> > It sucks.

Yup. Overhead's too large in today's filesystems.
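
To put a rough number on that overhead, here is a minimal sketch (an illustration only, not anything portage does) that reads a tree of tiny files one at a time and then reads the same data back from a single combined file. The file names, the `DESCRIPTION=` content and the tab-separated index format are made up for the example, and the timings will depend heavily on the filesystem and on what is already cached.

```python
import os
import tempfile
import time


def make_tree(root, count):
    """Create `count` tiny text files, one per hypothetical package."""
    for i in range(count):
        with open(os.path.join(root, f"pkg-{i}.txt"), "w") as fh:
            fh.write(f"DESCRIPTION=package {i}\n")


def read_per_file(root):
    """One open/read/close cycle per entry, as in a file-per-package layout."""
    entries = {}
    for name in sorted(os.listdir(root)):
        with open(os.path.join(root, name)) as fh:
            entries[name] = fh.read()
    return entries


def write_combined(entries, path):
    """Pack every entry into one file: name, tab, content, one per line."""
    with open(path, "w") as fh:
        for name, text in entries.items():
            fh.write(f"{name}\t{text.strip()}\n")


def read_combined(path):
    """A single open/read/close cycle for the whole index."""
    entries = {}
    with open(path) as fh:
        for line in fh:
            name, text = line.rstrip("\n").split("\t", 1)
            entries[name] = text + "\n"
    return entries


def compare(count=2000):
    """Return (entries_equal, seconds_per_file, seconds_combined)."""
    with tempfile.TemporaryDirectory() as tmp:
        tree = os.path.join(tmp, "tree")
        os.mkdir(tree)
        make_tree(tree, count)
        index = os.path.join(tmp, "INDEX")

        start = time.perf_counter()
        per_file = read_per_file(tree)
        t_files = time.perf_counter() - start

        write_combined(per_file, index)

        start = time.perf_counter()
        combined = read_combined(index)
        t_index = time.perf_counter() - start

    return per_file == combined, t_files, t_index


if __name__ == "__main__":
    same, t_files, t_index = compare()
    print(f"same data: {same}, per-file: {t_files:.4f}s, combined: {t_index:.4f}s")
```

Since the files have only just been written, they are likely still in the page cache, so this mostly measures per-file call overhead rather than disk seeks. A cold cache would probably make the per-file case look worse.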

> > Portage is crying for an sql database backend... mysql, sqllite, mmsql...
> > anything would be nice.
> >
> > Tell us more about your bsd ports, it sounds interesting...
>
> I believe FreeBSD uses an INDEX file in /usr/ports/INDEX, and then
> compiles that file into a berkeley DB file or something in /usr/ports/INDEX.db.
>
> I'm not really very fond of FreeBSD ports these days, actually.
> It has the feel of something hackish, in desperate need of a
> good bottom-up redesign.
>
> I'm starting to think that Gentoo's Portage is superior in almost
> all ways, but I do think this search speed thing needs to be
> dealt with.
>
> I'm using esearch now, which is nice, but rebuilding the database is
> a royal pain in the rear, and the database isn't kept in sync between
> emerge runs.
>
> If the esearch database could be updated without having to rebuild
> the entire thing (or at least without having to look at the filesystem
> to rebuild the entire thing) after every emerge operation then I think
> we'd be doing well.

Hardly. Have you ever done a qpkg -q? I can drink enough coffee to severely upset my nervous system in the time it takes to finish. For some reason, most of that time is spent in a sed (one that looks like it could be done with a tr, and also like it's handling an insane amount of data).

I think ocaml is a good suggestion: pretty fast, and harder to make mistakes in.

But as also mentioned, the real problem is that portage is very filesystem based right now, which is what makes updating with rsync simple and bandwidth-efficient. Replacing something like that would take a lot of work, so building some database on top of it is essential. A relational db makes a fair bit of sense for some of this, though a dependency on postgres or mysql doesn't; something smaller that can be made self-contained, like sqlite, would fit better. How to update it (dependency updates, world dependency changes) should then be really thought about, 'cos without proper incremental/hashed update stuff it might turn out to be as useful as esearch, and rather more annoying, as it's rather crucial base-system software we're talking about.

And anything else one can do smarter is good. I'm tempted to give ocaml, and possibly even my random ideas on the matter, a shot as a toy project. Ugh, too much to do, though.

--Bart Alewijnse

--
gentoo-performance@g.o mailing list
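
As a rough illustration of the "incremental/hashed update" idea above, here is a minimal sketch of a self-contained sqlite cache that only rewrites rows for files whose content hash has changed. It is not how portage or esearch work; the table layout and the helper names (`open_cache`, `sync_cache`, `search`) are hypothetical. It still walks the tree to compute hashes, so it saves database rewrites rather than filesystem reads; feeding it only the paths that rsync reports as changed would be one way to avoid the walk.

```python
import hashlib
import os
import sqlite3


def file_digest(path):
    """Content hash used to decide whether a cached row is still current."""
    with open(path, "rb") as fh:
        return hashlib.sha1(fh.read()).hexdigest()


def open_cache(db_path):
    """Open (or create) the cache; the schema here is an assumption."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries ("
        " path TEXT PRIMARY KEY, digest TEXT NOT NULL, body TEXT NOT NULL)"
    )
    return conn


def sync_cache(conn, root):
    """Bring the cache in line with the tree; return (changed, removed)."""
    known = dict(conn.execute("SELECT path, digest FROM entries"))
    seen = set()
    changed = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            seen.add(rel)
            digest = file_digest(full)
            if known.get(rel) == digest:
                continue  # unchanged since the last sync, leave the row alone
            with open(full, encoding="utf-8", errors="replace") as fh:
                body = fh.read()
            conn.execute(
                "INSERT OR REPLACE INTO entries (path, digest, body)"
                " VALUES (?, ?, ?)",
                (rel, digest, body),
            )
            changed += 1
    # Drop rows for files that no longer exist in the tree.
    stale = [p for p in known if p not in seen]
    conn.executemany("DELETE FROM entries WHERE path = ?", [(p,) for p in stale])
    conn.commit()
    return changed, len(stale)


def search(conn, term):
    """Substring search over cached file bodies, without touching the tree."""
    rows = conn.execute(
        "SELECT path FROM entries WHERE body LIKE ? ORDER BY path",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]
```

Running `sync_cache` a second time on an unchanged tree writes nothing, which is the property the esearch-style full rebuild lacks.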

