List Archive: gentoo-performance
Headers:
To: gentoo-performance@g.o
From: "Paul de Vrieze" <pauldv@g.o>
Subject: Re: Re: portage performance
Date: Mon, 26 Jul 2004 11:18:28 +0200 (CEST)
>> > The BIG hitch in portage is the database strategy....it's file
>> > system based. Basically it's thousands of small text files... you
>> > want to update
>> > the database?.... open, read, close over and over again....
>> >
>> > It sucks.
> Yup. Overhead's too large in today's filesystems.
>
>> > Portage is crying for an sql database backend... mysql, sqlite,
>> > mmsql...
>> > anything would be nice.
>> >
>> > Tell us more about your bsd ports, it sounds interesting...
>>
>> I believe FreeBSD uses an INDEX file in /usr/ports/INDEX, and then
>> compiles that file into a berkeley DB file or something in
>> /usr/ports/INDEX.db.
>>
>> I'm not really very fond of FreeBSD ports these days, actually.
>> It has the feel of something hackish, in desperate need of a
>> good bottom-up redesign.
>>
>> I'm starting to think that Gentoo's Portage is superior in almost
>> all ways, but I do think this search speed thing needs to be
>> dealt with.
> Definitely.
>
>> I'm using esearch now, which is nice, but rebuilding the database
>> is a royal pain in the rear, and the database isn't kept in sync
>> between emerge runs.
>>
>> If the esearch database could be updated without having to rebuild
>> the entire thing (or at least without having to look at the
>> filesystem to rebuild the entire thing) after every emerge
>> operation then I think we'd be doing well.
> Hardly. Have you ever done a qpkg -q? I can drink enough coffee to
> severely upset my nervous system in the time it takes to finish.
> For some reason, most of it is in a sed (that looks like it could
> be done in a tr, and also like it's handling an insane amount of
> data).
>
> I think ocaml is a good suggestion - pretty fast, harder to make
> mistakes. But as also mentioned, the real problem is that portage
> is very filesystem based right now, which makes updating with rsync
> simple and bandwidth-efficient. Replacing something like that would
> take a lot of work, so some database building is essential (a
> relational db makes a fair bit of sense for some of this, though a
> dependency on postgres or mysql doesn't; something smaller that can
> be made self-contained, like sqlite, would), and updating those
> (dependency updates, world dependency changes)
> should then be really thought about, 'cos without proper
> incremental/hashed update stuff it might turn out to be as useful
> as esearch, and rather more annoying as it's rather crucial
> base-system software we're talking about. And anything else one can
> do smarter is good.
>
> I'm tempted to give ocaml and possibly even my random ideas on the
> matter a shot as a toy project. Ugh, too much to do, though.

The basic problem with searching is that current portage does not implement
it smartly. I have working (emerge -s like) code that is blazingly fast
because it does not actually open all the ebuilds. Description searching,
however, cannot be made fast without some kind of cache. I don't think
creating a reliable cache for that is going to be a priority, but it is
certainly possible ;-).
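
To illustrate why a name search does not need to open any ebuilds, here is a
minimal sketch. It is not the code mentioned above. It assumes the usual tree
layout of <category>/<package>/ directories, and the function name
search_names and the list of skipped top-level directories are hypothetical.

```python
import os
import re

# Assumption: top-level directories of the tree that are not categories.
NON_CATEGORY_DIRS = {"profiles", "eclass", "metadata", "licenses",
                     "scripts", "distfiles", "packages"}


def search_names(tree_root, pattern):
    """Return sorted 'category/package' names whose package name matches.

    Only directory entries are listed; no ebuild file is opened or read.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    matches = []
    with os.scandir(tree_root) as categories:
        for category in categories:
            # Skip plain files, hidden entries and known non-category dirs.
            if (not category.is_dir()
                    or category.name.startswith(".")
                    or category.name in NON_CATEGORY_DIRS):
                continue
            with os.scandir(category.path) as packages:
                for package in packages:
                    # The directory name is the package name.
                    if package.is_dir() and regex.search(package.name):
                        matches.append(f"{category.name}/{package.name}")
    return sorted(matches)
```

The cost is one directory listing per category, whatever the number of
ebuilds. A description search cannot use this shortcut, since descriptions
live inside the ebuilds, which is why it needs a cache.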

As for rsync, the number of files is too large, and I would like to reduce
it, but I don't see databases as a good replacement. We need something that
works in such a way that even a corrupted tree ends up in a good state after
updating.
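
One way to picture "a corrupted tree ends up in a good state" is to compare
the local tree against a trusted list of checksums and re-fetch whatever does
not match. The sketch below is only an illustration of that idea, not how
portage or rsync work internally. The manifest format (relative path mapped
to a SHA-256 digest) and the name plan_repair are assumptions.

```python
import hashlib
import os


def file_digest(path):
    """SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def plan_repair(tree_root, manifest):
    """Compare a local tree with a manifest {relative_path: sha256_hex}.

    Returns (to_fetch, to_delete): files that are missing or whose content
    differs from the manifest, and local files the manifest does not list.
    Relative paths use '/' as separator (an assumption of this sketch).
    """
    to_fetch = []
    for rel_path, expected in manifest.items():
        local = os.path.join(tree_root, *rel_path.split("/"))
        if not os.path.isfile(local) or file_digest(local) != expected:
            to_fetch.append(rel_path)

    to_delete = []
    for dirpath, _dirnames, filenames in os.walk(tree_root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), tree_root)
            rel = rel.replace(os.sep, "/")
            if rel not in manifest:
                to_delete.append(rel)
    return sorted(to_fetch), sorted(to_delete)
```

Because the plan depends only on the manifest and the current file contents,
applying it gives the same result whatever state the local tree was in
before, which is the property asked for above.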

Paul

--
Paul de Vrieze
Gentoo Developer
Mail: pauldv@g.o
Homepage: http://www.devrieze.net


--
gentoo-performance@g.o mailing list

Replies:
Re: Re: portage performance
-- Bart Alewijnse
Re: Re: portage performance
-- Brian Harring
Re: Re: portage performance
-- Colin Kingsley
References:
portage performance
-- Jesse Guardiani
Re: portage performance
-- Jerry McBride
Re: portage performance
-- Jesse Guardiani
Re: Re: portage performance
-- Bart Alewijnse
Updated Jun 17, 2009

Summary: Archive of the gentoo-performance mailing list.

Copyright 2001-2013 Gentoo Foundation, Inc.