On Tuesday 27 July 2004 00:57, Brian Harring wrote:
> > The basic problem in searching is actually that it isn't implemented
> > smartly in current portage. I have working (emerge -s like) code that
> > is blazingly fast as it does not actually open all ebuilds.
>
> Searching works off of the cache for the most part; if a cache entry is
> stale, it's updated (e.g. the ebuild is opened and sourced).
> Unless you're not checking the cache and updating it as you proceed,
> your implementation ought to suffer the same limitation.
|
Basically it does a directory glob to select candidates. Each candidate is
then checked to see whether it is a real package; those that are get returned
as results.
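
To make that concrete, here is a minimal sketch of the idea, not my actual
code. It assumes the usual category/package/package-version.ebuild tree layout
and treats "real package" as "a directory holding at least one .ebuild file";
the function name search_names is made up for illustration.

```python
import glob
import os


def search_names(tree_root, term):
    """Return sorted 'category/package' names whose package name contains term."""
    # Glob for package directories two levels below the tree root.
    # glob.escape keeps metacharacters in the inputs from being interpreted.
    pattern = os.path.join(
        glob.escape(tree_root), "*", "*" + glob.escape(term) + "*"
    )
    results = []
    for candidate in glob.glob(pattern):
        # Assumed validity check: a real package is a directory that
        # contains at least one .ebuild file. No ebuild is ever opened.
        if not os.path.isdir(candidate):
            continue
        if not glob.glob(os.path.join(glob.escape(candidate), "*.ebuild")):
            continue
        category = os.path.basename(os.path.dirname(candidate))
        results.append(category + "/" + os.path.basename(candidate))
    return sorted(results)
```

Only directory listings are touched here, which is where the speed comes from;
the match is case-sensitive and limited to package names.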
|
> There are 2 things that need to be done (in my books at least) to step
> up the speed of a description search-
> A) sql based cache backend, whether sqlite or mysql. Either that, or
> extend the flat cache to store the descriptions in a central index.
> B) alter the search description alg so that instead of stepping through
> each entry getting the description, we just state "give me all packages
> that have a description matching blar", and leave it up to the backend
> to decide what is the most efficient way to search. With flat cache,
> we'd still have to go file by file; w/ a sql variant, it could take
> advantage of the appropriate syntax.
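
For illustration, B) with an SQLite backend might look roughly like the sketch
below. The table layout and both function names are hypothetical and are not
taken from the existing sql cache code; the point is only that the search
becomes a single query handed to the backend.

```python
import sqlite3


def build_cache(entries):
    """Create an in-memory cache from (package, description) pairs."""
    # Hypothetical schema; the real sql cache backend may look different.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cache (package TEXT PRIMARY KEY, description TEXT)"
    )
    conn.executemany("INSERT INTO cache VALUES (?, ?)", entries)
    conn.commit()
    return conn


def search_description(conn, term):
    """Ask the backend for every package whose description contains term."""
    # Escape LIKE wildcards so the term is matched literally.
    escaped = (
        term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    rows = conn.execute(
        "SELECT package FROM cache "
        "WHERE description LIKE ? ESCAPE '\\' ORDER BY package",
        ("%" + escaped + "%",),
    )
    return [row[0] for row in rows]
```

SQLite's LIKE is case-insensitive for ASCII by default, so no extra lowercasing
is needed in this sketch.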
|
Probably some kind of caching or index-building tool (like makewhatis) is the
way to go. Another option would be to use grep first to limit the number of
candidate packages that get examined for real (grep is a lot cheaper than
parsing).
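
A rough sketch of the grep-first idea, done in plain Python instead of calling
grep itself. The regex is only a stand-in for real parsing and assumes
DESCRIPTION is a simple double-quoted assignment on one line, which real
ebuilds (being bash) are not guaranteed to follow; search_descriptions is a
made-up name.

```python
import glob
import os
import re

# Simplified stand-in for real parsing: assumes a plain double-quoted
# DESCRIPTION="..." assignment. Real ebuilds are bash and may not fit this.
DESCRIPTION_RE = re.compile(r'^DESCRIPTION="([^"]*)"', re.MULTILINE)


def search_descriptions(tree_root, term):
    """Return {ebuild path: description} for descriptions containing term."""
    needle = term.lower()
    matches = {}
    pattern = os.path.join(glob.escape(tree_root), "*", "*", "*.ebuild")
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8", errors="replace") as handle:
            text = handle.read()
        # Cheap grep-like pass: skip files that never mention the term.
        if needle not in text.lower():
            continue
        # More expensive pass (here just a regex) only for the survivors.
        found = DESCRIPTION_RE.search(text)
        if found and needle in found.group(1).lower():
            matches[path] = found.group(1)
    return matches
```

Every ebuild still gets read once, so this only pays off when the second pass
is much more expensive than a substring scan, as real parsing would be.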
|
> Since there is code for a sql based cache backend, B has been bounced
> around in #gentoo-portage a bit. Prior to it actually happening I
> would think the sql db code would need to be cleaned up/QA'd/etc.
>
> Course, there still is the issue of verifying that the cache entry
> isn't stale... :)
|
For now I don't have any persistent caching in my working code (except where
it uses old code for accessing current ebuilds), to keep it simple. It is
actually already quite fast.
|
> Err, eh? If the tree is corrupted, and sync'd against a
> good/non-corrupted tree, it ought to be reverted to a sane state.
|
Exactly

Paul

--
Paul de Vrieze
Gentoo Developer
Mail: pauldv@g.o
Homepage: http://www.devrieze.net