Gentoo Archives: gentoo-performance

From: Jesse Guardiani <jesse@×××××××.net>
To: gentoo-performance@l.g.o
Subject: [gentoo-performance] Re: Re: portage performance
Date: Fri, 30 Jul 2004 14:13:02
Message-Id: cedl15$hqs$1@sea.gmane.org
In Reply to: Re: [gentoo-performance] Re: portage performance by Brian Harring
1 Brian Harring wrote:
2
3 > -----BEGIN PGP SIGNED MESSAGE-----
4 > Hash: SHA1
5 >
6 >
7 >> The basic problem in searching is actually that it isn't implemented
8 >> smartly
9 >> in current portage. I have working (emerge -s like) code that is
10 >> blazingly
11 >> fast as it does not actually open all ebuilds.
12 > Searching works off of the cache for the most part; if a cache entry is
13 > stale, it's updated (i.e. the ebuild is opened and sourced).
14 > Unless you're not checking the cache and updating it as you proceed,
15 > your implementation ought to suffer the same limitation.
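
To make the stale-entry behaviour described above concrete, here is a minimal sketch. It assumes staleness is decided by comparing file modification times, which is an illustrative assumption and not a description of portage's actual cache code. The names `cache_is_stale`, `get_description`, and the `regenerate` callback are hypothetical.

```python
import os


def cache_is_stale(ebuild_path, cache_path):
    """Return True when the cache entry is missing or older than the ebuild.

    Assumption: staleness is judged by mtime only.
    """
    try:
        cache_mtime = os.path.getmtime(cache_path)
    except OSError:
        # No cache entry yet, so it has to be generated.
        return True
    return os.path.getmtime(ebuild_path) > cache_mtime


def get_description(ebuild_path, cache_path, regenerate):
    """Return the cached description, refreshing the entry if it is stale.

    `regenerate` is a hypothetical callback standing in for the expensive
    step of opening and sourcing the ebuild.
    """
    if cache_is_stale(ebuild_path, cache_path):
        description = regenerate(ebuild_path)
        with open(cache_path, "w") as f:
            f.write(description + "\n")
        return description
    with open(cache_path) as f:
        return f.readline().rstrip("\n")
```

The point of the sketch is that any search that wants correct results pays the regeneration cost for stale entries, whichever front end drives it.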
16 >
17 >> Doing description searching is
18 >> impossible to do fast without some kind of cache. I don't think
19 >> creating a
20 >> reliable cache for that is going to be a priority,
21 > There are 2 things that need to be done (in my books at least) to step
22 > up the speed of a description search-
23 > A) sql based cache backend, whether sqlite or mysql.
24
25 If it comes to this, I definitely vote for sqlite. It's much faster than
26 mysql for embedded apps, and you don't have to worry about whether or not
27 the server is running.
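
As a rough illustration of the "no server to worry about" point, the sketch below uses Python's standard `sqlite3` module with an in-memory database. The table layout (`packages` with `name` and `description` columns) and the helper names are assumptions made for the example, not an existing portage schema.

```python
import sqlite3


def build_cache(entries):
    """Create an in-process SQLite cache from (name, description) pairs.

    ":memory:" keeps the example self-contained; a real cache would
    presumably live in a file on disk.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE packages ("
        "name TEXT PRIMARY KEY, description TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO packages VALUES (?, ?)", entries)
    conn.commit()
    return conn


def search_descriptions(conn, term):
    """Return names of packages whose description contains `term`.

    LIKE is case-insensitive for ASCII in SQLite by default. Any '%' or
    '_' inside `term` acts as a wildcard here; escaping is left out.
    """
    rows = conn.execute(
        "SELECT name FROM packages WHERE description LIKE ? ORDER BY name",
        ("%" + term + "%",),
    )
    return [name for (name,) in rows]
```

Everything runs inside the calling process, so there is no daemon to start or connection to configure.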
28
29
30 > Either that, or
31 > extend the flat cache to store the descriptions in a central index.
32
33 I think this is a good idea. AFAIK, this is what FreeBSD does. Perhaps
34 tinycdb could be used? (OCaml has a CDB module available too. You can
35 get it here: http://bleu.west.spy.net/~dustin/projects/ocaml/index.xtp
36 (BSD licence))
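
A central description index can be sketched without committing to any particular library. The example below assumes a single tab-separated text file with one package per line, which is a hypothetical format chosen for illustration (it is neither tinycdb's format nor what FreeBSD uses). It also assumes names and descriptions contain no tabs or newlines.

```python
def write_index(index_path, descriptions):
    """Write all descriptions to one file as 'name<TAB>description' lines."""
    with open(index_path, "w", encoding="utf-8") as f:
        for name in sorted(descriptions):
            f.write(name + "\t" + descriptions[name] + "\n")


def search_index(index_path, term):
    """Scan the single index file and return names with matching descriptions."""
    term = term.lower()
    matches = []
    with open(index_path, encoding="utf-8") as f:
        for line in f:
            name, _, description = line.rstrip("\n").partition("\t")
            if term in description.lower():
                matches.append(name)
    return matches
```

The search still reads every description, but it does so from one sequentially read file instead of opening one cache file per package. The index has to be rebuilt or patched whenever a cache entry changes.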
37
38
39 > B) alter the description search algorithm so that instead of stepping
40 > through each entry to get the description, we just state "give me all
41 > packages that have a description matching blar", and leave it up to the
42 > backend to decide the most efficient way to search. With the flat cache,
43 > we'd still have to go file by file; with an SQL variant, it could take
44 > advantage of the appropriate query syntax.
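
The idea in B) can be sketched as a common method that each cache backend implements in its own way. The class names `FlatCache` and `SqlCache` and the method `match_description` are hypothetical. The flat backend keeps its entries in a dict here, standing in for the file-by-file walk a real flat cache would need.

```python
import sqlite3


class FlatCache:
    """Flat backend: has to look at every entry in turn."""

    def __init__(self, descriptions):
        self._descriptions = dict(descriptions)

    def match_description(self, term):
        term = term.lower()
        return sorted(
            name
            for name, text in self._descriptions.items()
            if term in text.lower()
        )


class SqlCache:
    """SQL backend: hands the whole search to the database engine."""

    def __init__(self, descriptions):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE packages (name TEXT PRIMARY KEY, description TEXT)"
        )
        self._conn.executemany(
            "INSERT INTO packages VALUES (?, ?)", list(descriptions.items())
        )

    def match_description(self, term):
        rows = self._conn.execute(
            "SELECT name FROM packages WHERE description LIKE ? ORDER BY name",
            ("%" + term + "%",),
        )
        return [name for (name,) in rows]


def search(backend, term):
    """The caller only states what it wants; the backend decides how."""
    return backend.match_description(term)
```

With this split, the search code never iterates over entries itself, so a backend with a faster lookup path can be swapped in without touching the caller.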
45
46 I personally prefer well-designed hash table schemes. This is part of a
47 base operating system, so it needs to be extremely efficient. IMO, SQL
48 databases tend to encourage laziness. This is not to say that they don't
49 have their place, but I don't think they belong in a package management
50 system.
51
52 Still, it's a novel idea, I think, and it offers a great deal of flexibility.
53
54
55 --
56 Jesse Guardiani, Systems Administrator
57 WingNET Internet Services,
58 P.O. Box 2605 // Cleveland, TN 37320-2605
59 423-559-LINK (v) 423-559-5145 (f)
60 http://www.wingnet.net
61
62
63
64 --
65 gentoo-performance@g.o mailing list