Gentoo Archives: gentoo-performance

From: Bart Alewijnse <scarfboy@×××××.com>
To: gentoo-performance@l.g.o
Subject: Re: [gentoo-performance] Re: portage performance
Date: Sun, 18 Jul 2004 10:29:17
Message-Id: b71082d8040718032953e8592f@mail.gmail.com
In Reply to: [gentoo-performance] Re: portage performance by Jesse Guardiani
1 > > The BIG hitch in portage is the database strategy....it's file system
2 > > based. Basically it's thousands of small text files... you want to update
3 > > the database?.... open, read, close over and over again....
4 > >
5 > > It sucks.
6 Yup. Overhead's too large in today's filesystems.
7
8 > > Portage is crying for an sql database backend... mysql, sqlite, mmsql...
9 > > anything would be nice.
10 > >
11 > > Tell us more about your bsd ports, it sounds interesting...
12 >
13 > I believe FreeBSD uses an INDEX file in /usr/ports/INDEX, and then
14 > compiles that file into a berkeley DB file or something in /usr/ports/INDEX.db.
15 >
16 > I'm not really very fond of FreeBSD ports these days, actually.
17 > It has the feel of something hackish, in desperate need of a
18 > good bottom-up redesign.
19 >
20 > I'm starting to think that Gentoo's Portage is superior in almost
21 > all ways, but I do think this search speed thing needs to be
22 > dealt with.
23 Definitely.
24
25 > I'm using esearch now, which is nice, but rebuilding the database is
26 > a royal pain in the rear, and the database isn't kept in sync between
27 > emerge runs.
28 >
29 > If the esearch database could be updated without having to rebuild
30 > the entire thing (or at least without having to look at the filesystem
31 > to rebuild the entire thing) after every emerge operation then I think
32 > we'd be doing well.
33 Hardly. Have you ever done a qpkg -q? I can drink enough coffee to
34 severely upset my nervous system in the time it takes to finish. For
35 some reason, most of that time is spent in a sed call (one that looks like it
36 could be done with tr, and that seems to be handling an insane amount of data).
37
38 I think ocaml is a good suggestion - pretty fast, and harder to make mistakes in.
39 But as also mentioned, the real issue is that portage is very filesystem
40 based right now, which is what makes updating with rsync simple and
41 bandwidth-efficient. Replacing that would take a lot of work, so building
42 some kind of database on top of it is essential. A relational db makes a fair
43 bit of sense for some of this, though a dependency on postgres or mysql
44 doesn't; something smaller and self-contained like sqlite would fit better.
45 How that database gets updated (dependency updates, changes to world
46 dependencies) should then be really thought through, 'cos without proper
47 incremental/hashed updates it might turn out to be only as useful as esearch,
48 and rather more annoying, as it's crucial base-system software we're talking
49 about. And anything else one can do smarter is good.
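
To make the incremental/hashed update idea a bit more concrete, here is a rough sketch in Python with the standard sqlite3 module. It is not how portage or esearch works; the table name pkg, the function names and the "*.ebuild" glob are all made up for illustration. It still reads every file to hash it, so it only saves the re-parse and rewrite of unchanged entries. A smarter version could presumably be fed the list of files rsync actually touched.

```python
import hashlib
import sqlite3
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return a hex SHA-1 of the file's bytes (any stable hash would do)."""
    return hashlib.sha1(path.read_bytes()).hexdigest()


def sync_cache(tree: Path, db: sqlite3.Connection) -> dict:
    """Bring the cache in line with the tree, rewriting only changed entries."""
    # Hypothetical schema: one row per file, keyed by its path in the tree.
    db.execute(
        "CREATE TABLE IF NOT EXISTS pkg ("
        "path TEXT PRIMARY KEY, digest TEXT NOT NULL, body TEXT NOT NULL)"
    )
    known = dict(db.execute("SELECT path, digest FROM pkg"))
    stats = {"added": 0, "updated": 0, "removed": 0, "unchanged": 0}
    seen = set()

    for f in sorted(tree.rglob("*.ebuild")):
        rel = f.relative_to(tree).as_posix()
        seen.add(rel)
        digest = file_digest(f)
        if known.get(rel) == digest:
            # Same hash as last time: leave the row alone.
            stats["unchanged"] += 1
            continue
        stats["updated" if rel in known else "added"] += 1
        db.execute(
            "INSERT OR REPLACE INTO pkg (path, digest, body) VALUES (?, ?, ?)",
            (rel, digest, f.read_text(errors="replace")),
        )

    # Drop rows for files that have disappeared from the tree.
    for rel in set(known) - seen:
        db.execute("DELETE FROM pkg WHERE path = ?", (rel,))
        stats["removed"] += 1

    db.commit()
    return stats
```

Calling sync_cache(Path("/usr/portage"), sqlite3.connect("cache.db")) after each sync would then only rewrite rows whose hash changed (both paths here are just examples).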
50
51 I'm tempted to give ocaml and possibly even my random ideas on the matter
52 a shot as a toy project. Ugh, too much to do, though.
53
54 --Bart Alewijnse
55
56 --
57 gentoo-performance@g.o mailing list
