Gentoo Archives: gentoo-performance

From:	Paul de Vrieze <pauldv@g.o>
To:	gentoo-performance@l.g.o
Subject:	Re: [gentoo-performance] Re: portage performance
Date:	Mon, 26 Jul 2004 09:19:16
Message-Id:	`1501.221.136.16.201.1090833508.squirrel@221.136.16.201`
In Reply to:	Re: [gentoo-performance] Re: portage performance by Bart Alewijnse

1	>> > The BIG hitch in portage is the database strategy....it's file
2	>> > system based. Basically it's thousands of small text files... you
3	>> > want to update
4	>> > the database?.... open, read, close over and over again....
5	>> >
6	>> > It sucks.
7	> Yup. Overhead's too large in today's filesystems.
8	>
9	>> > Portage is crying for an sql database backend... mysql, sqlite,
10	>> > mmsql...
11	>> > anything would be nice.
12	>> >
13	>> > Tell us more about your bsd ports, it sounds interesting...
14	>>
15	>> I believe FreeBSD uses an INDEX file in /usr/ports/INDEX, and then
16	>> compiles that file into a berkeley DB file or something in
17	>> /usr/ports/INDEX.db.
18	>>
19	>> I'm not really very fond of FreeBSD ports these days, actually.
20	>> It has the feel of something hackish, in desperate need of a
21	>> good bottom-up redesign.
22	>>
23	>> I'm starting to think that Gentoo's Portage is superior in almost
24	>> all ways, but I do think this search speed thing needs to be
25	>> dealt with.
26	> Definitely.
27	>
28	>> I'm using esearch now, which is nice, but rebuilding the database
29	>> is a royal pain in the rear, and the database isn't kept in sync
30	>> between emerge runs.
31	>>
32	>> If the esearch database could be updated without having to rebuild
33	>> the entire thing (or at least without having to look at the
34	>> filesystem to rebuild the entire thing) after every emerge
35	>> operation then I think we'd be doing well.
36	> Hardly. Have you ever done a qpkg -q? I can drink enough coffee to
37	> severely upset my nervous system in the time it takes to finish.
38	> For some reason, most of it is in a sed (that looks like it could
39	> be done in a tr, and also like it's handling an insane amount of
40	> data).
41	>
42	> I think ocaml is a good suggestion - pretty fast, harder to make
43	> mistakes. But as also mentioned, the real problem is that portage
44	> is very filesystem based right now, which makes updating with rsync
45	> simple and bandwidth-efficient. Replacing something like that would
46	> take a lot of work, so some database building (relational db makes
47	> a fair bit of sense for some of this, though depending on
48	> postgres or mysql doesn't; something smaller that can be made
49	> self-contained, like sqlite, would) is essential, and updating those
50	> (dependencies updates, world dependencies changes)
51	> should then be really thought about, 'cos without proper
52	> incremental/hashed update stuff it might turn out to be as useful
53	> as esearch, and rather more annoying as it's rather crucial
54	> base-system software we're talking about. And anything else one can
55	> do smarter is good.
56	>
57	> I'm tempted to give ocaml and possibly even my random ideas on the
58	> matter a shot as a toy project. Ugh, too much to do, though.
59
60	The basic problem with searching is that it isn't implemented smartly in
61	current portage. I have working (emerge -s like) code that is blazingly fast
62	because it does not actually open all the ebuilds. Description searching,
63	however, cannot be made fast without some kind of cache. I don't think
64	creating a reliable cache for that is going to be a priority, but it is
65	certainly possible ;-).
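
To illustrate the idea of a name search that never opens an ebuild, here is a minimal sketch. It is not the code mentioned above; it only assumes the usual tree layout of `<root>/<category>/<package>/` directories, and the function name `search_names` and the `NON_CATEGORY_DIRS` set are hypothetical choices made for this example. Package names are read straight from directory listings, so no ebuild file is opened or parsed.

```python
import os

# Assumption: top-level directories of the tree that are not package
# categories. This set is illustrative, not an authoritative list.
NON_CATEGORY_DIRS = {"profiles", "eclass", "licenses", "metadata",
                     "scripts", "distfiles", "packages"}


def search_names(tree_root, term):
    """Return sorted 'category/package' names whose package part contains term.

    Only directory listings are read; no ebuild is opened.
    """
    term = term.lower()
    matches = []
    for category in sorted(os.listdir(tree_root)):
        cat_path = os.path.join(tree_root, category)
        if category.startswith(".") or category in NON_CATEGORY_DIRS:
            continue
        if not os.path.isdir(cat_path):
            continue
        for package in sorted(os.listdir(cat_path)):
            # Cheap string test first, then confirm it is a package directory.
            if term in package.lower() and os.path.isdir(
                    os.path.join(cat_path, package)):
                matches.append(f"{category}/{package}")
    return matches
```

A call such as `search_names("/usr/portage", "vim")` would list matching packages by name only. Description searching cannot be done this way, because descriptions live inside the ebuilds; that is where a cache would be needed.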
66
67	As for rsync, the number of files is too large, and I would like to reduce
68	it, but I don't see databases as a good replacement. We need something that
69	works in such a way that even a corrupted tree ends up in a good state
70	after updating.
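
One way to picture the "corrupted tree ends up good after updating" requirement is a check of every file against a list of expected digests, so that anything missing or damaged gets fetched again. The sketch below is only an assumption about how such a check could look; the `manifest` dictionary (relative path to SHA-256 hex digest) and both function names are hypothetical and are not part of portage or rsync.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_refetch(tree_root, manifest):
    """Return sorted relative paths that are missing or do not match.

    manifest is a hypothetical dict: relative path -> expected hex digest.
    """
    damaged = []
    for rel_path, expected in manifest.items():
        full_path = os.path.join(tree_root, rel_path)
        if not os.path.isfile(full_path) or file_digest(full_path) != expected:
            damaged.append(rel_path)
    return sorted(damaged)
```

With plain files, a damaged file affects only itself and can be replaced individually, which is part of why a single database file is a harder fit for this requirement.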
71
72	Paul
73
74	--
75	Paul de Vrieze
76	Gentoo Developer
77	Mail: pauldv@g.o
78	Homepage: http://www.devrieze.net
79
80
81	--
82	gentoo-performance@g.o mailing list

Replies

Subject	Author
Re: [gentoo-performance] Re: portage performance	Colin Kingsley <ckingsley@×××××.com>
Re: [gentoo-performance] Re: portage performance	Brian Harring <ferringb@g.o>
Re: [gentoo-performance] Re: portage performance	Bart Alewijnse <scarfboy@×××××.com>
