Gentoo Archives: gentoo-portage-dev

From: Tambet <qtvali@×××××.com>
To: gentoo-portage-dev@l.g.o
Subject: Re: [gentoo-portage-dev] Re: search functionality in emerge
Date: Tue, 02 Dec 2008 00:20:41
Message-Id: cea53e3c0812011620w94e8847vb3777d2b05832ded@mail.gmail.com
In Reply to: Re: [gentoo-portage-dev] Re: search functionality in emerge by Emma Strubell
2008/12/2 Emma Strubell <emma.strubell@×××××.com>

> True, true. Like I said, I don't really use overlays, so excuse my
> ignorance.
>

Do you know the order in which things should be done?

Rules of Optimization:

- Rule 1: Don't do it.
- Rule 2 (for experts only): Don't do it yet.

What this actually means: functionality comes first, readability comes
next, and optimization comes last. Unless you are creating a fancy 3D
engine for a kung fu game.

If you exclude overlays, you are removing functionality - and indeed
absolutely has-to-be-there functionality, because no one would intuitively
expect a search function to search only a subset of packages, however
reasonable that subset might be. So you can't, just can't, add this
package to the portage base - you could only write yet another external
search package for portage.

I looked at the code a bit:
Portage's "__init__.py" contains the comment "# search functionality".
After this comment there is a nice and simple search class.
It also contains the method "def action_sync(...)", which holds the
synchronization stuff.

Now, the search class is initialized by setting up 3 databases - porttree,
bintree and vartree, whatever those are. Those go into the self._dbs array,
and porttree also goes into self._portdb.

It contains some more methods:
_findname(...) returns the result of self._portdb.findname(...) called
with the same parameters, or None if it does not exist.
The other methods do similar things - each maps to one method or another.
execute does the real search...
Now, "for package in self.portdb.cp_all()" is the important part here: it
currently loops over the whole portage tree. All kinds of matching are done
inside that loop.
self.portdb obviously points to porttree.py (unless it points to a fake
tree).
cp_all takes all porttrees and does a simple file search inside them. This
method should contain an optional index search.

    self.porttrees = [self.porttree_root] + \
        [os.path.realpath(t) for t in self.mysettings["PORTDIR_OVERLAY"].split()]

So self.porttrees contains a list of trees - the first of them is the root,
the others are overlays.

Now, what you have to do will not be any harder just because it has to
search overlays too.

You have to create a method "def cp_index(self)", which returns a
dictionary with package names as keys. The "for oroot..." loop in cp_all
will then iterate over "self.porttrees[1:]" instead of "self.porttrees", so
that the file search covers only the overlays, and "d = {}" will be replaced
with "d = self.cp_index()". If the index is not there, the old version will
be used (thus you need an internal porttrees variable that holds either all
trees or all except the first).

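A rough sketch of that cp_all change, in Python. TreeLister, its
constructor arguments and the in-memory index dictionary are made up for
illustration; this is not the real porttree code, only the shape of the
fallback, and it assumes the index covers the main tree only:

```python
import os


class TreeLister:
    """Toy stand-in for the part of the tree database that lists packages."""

    def __init__(self, porttrees, categories, index=None):
        self.porttrees = porttrees    # [main tree, overlay, overlay, ...]
        self.categories = categories  # e.g. ["app-editors", "sys-apps"]
        self._index = index           # hypothetical: {"cat/pkg": None} or None

    def cp_index(self):
        """Return a copy of the indexed package names, or None without an index."""
        if self._index is None:
            return None
        return dict(self._index)

    def cp_all(self):
        """Return a sorted list of all "category/package" names."""
        d = self.cp_index()
        if d is None:
            # No index: scan every tree on disk, as the old code does.
            d = {}
            trees = self.porttrees
        else:
            # The index stands in for the main tree; scan only the overlays.
            trees = self.porttrees[1:]
        for category in self.categories:
            for oroot in trees:
                cat_dir = os.path.join(oroot, category)
                if not os.path.isdir(cat_dir):
                    continue
                for name in os.listdir(cat_dir):
                    if os.path.isdir(os.path.join(cat_dir, name)):
                        d[category + "/" + name] = None
        return sorted(d)
```

With index=None this behaves like a plain directory scan of all trees, so
the index stays strictly optional.
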
The other methods used by search are xmatch and aux_get - the first is used
several times and the second is used to get the description. You have to
cache the results of those specific queries and make the methods use your
cache - as you can see, those parts of portage are already able to handle
overlays. Thus, you again put your code at the beginning of those
functions: create index_xmatch and index_aux_get methods, then make xmatch
and aux_get call them and return their results unless those are None (or
some other sentinel, in case None is already a legal result). If they
return None, the old code runs and does its job. If the index has not been
created, the result is None. In the index_* methods, just check whether the
query is one you can answer, and if it is, answer it.

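The "ask the index first, fall back to the old code on None" pattern could
look roughly like this. CachedLookup, the slow_aux_get callable and the
layout of the index dictionary are assumptions for the sake of the example,
not portage's actual classes:

```python
class CachedLookup:
    """Toy wrapper: answer from the index when possible, else run the old code."""

    def __init__(self, slow_aux_get, index=None):
        self._slow_aux_get = slow_aux_get  # stands in for the existing aux_get body
        # Hypothetical layout: {(cpv, ("KEY", ...)): [value, ...]}, or None
        # when no index has been created.
        self._index = index

    def index_aux_get(self, cpv, keys):
        """Return the cached answer, or None if this query cannot be answered."""
        if self._index is None:
            return None
        return self._index.get((cpv, tuple(keys)))

    def aux_get(self, cpv, keys):
        result = self.index_aux_get(cpv, keys)
        if result is not None:
            return result
        # Index missing or query not covered: the old code does its job.
        return self._slow_aux_get(cpv, keys)
```

An index_xmatch method would follow the same shape; if None were a legal
answer for it, a dedicated sentinel object would have to take its place.
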
Obviously, the simplest way to create your index is to delete the old
index, then use those same methods to query for all necessary information.
The fastest way would be to update the index directly during sync, which
you could do later.

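Building the index through those same methods might be sketched as follows.
The method names cp_all, xmatch and aux_get come from the text above; the
"bestmatch-visible" level, the ["DESCRIPTION"] key list, the shape of the
index entries and the FakeDb class are assumptions used only to make the
example runnable:

```python
def build_index(db):
    """Rebuild the index from scratch by querying the database once per package."""
    index = {}
    for cp in db.cp_all():
        best = db.xmatch("bestmatch-visible", cp)
        if not best:
            continue  # nothing visible for this package, so nothing to index
        (description,) = db.aux_get(best, ["DESCRIPTION"])
        index[cp] = {"best": best, "description": description}
    return index


class FakeDb:
    """Hypothetical in-memory database exposing the three methods used above."""

    def __init__(self, packages):
        # packages: {"cat/pkg": ("cat/pkg-version", "description")}
        self._packages = packages

    def cp_all(self):
        return sorted(self._packages)

    def xmatch(self, level, atom):
        entry = self._packages.get(atom)
        return entry[0] if entry else ""

    def aux_get(self, cpv, keys):
        for best, description in self._packages.values():
            if best == cpv:
                return [description if key == "DESCRIPTION" else "" for key in keys]
        raise KeyError(cpv)
```

This is the slow-but-simple variant; updating the index during sync would
avoid the full rescan.
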
Please also add commands to turn the index on and off (the latter should
also delete it to save disk space). The default should be off until it's
fast, small and reliable. Also note that if the index is kept on the hard
drive, it might be faster if it's compressed (gz, for example) - reading a
compressed file and decompressing it takes less time, at the cost of more
processing power, than reading the uncompressed data in full.

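A minimal sketch of a gz-compressed on-disk index, using only the Python
standard library. The JSON format, the function names and the idea that
"off" simply means "no file" are illustrative assumptions, and whether
compression actually wins depends on the disk and CPU involved:

```python
import gzip
import json
import os


def save_index(index, path):
    """Write the index as gzip-compressed JSON."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(index, f)


def load_index(path):
    """Return the stored index, or None if indexing is off (no file present)."""
    if not os.path.exists(path):
        return None
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)


def disable_index(path):
    """Turn the index off and free the disk space it used."""
    if os.path.exists(path):
        os.remove(path)
```

Because load_index returns None when the file is absent, callers can fall
back to the old code path without a separate on/off flag.
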
Good luck!

> On Mon, Dec 1, 2008 at 5:17 PM, René 'Necoro' Neumann <lists@××××××.eu> wrote:
>
>> Emma Strubell wrote:
>> > 2) does anyone really need to search an overlay anyway?
>>
>> Of course. Take large (semi-)official overlays like sunrise. They can
>> easily be seen as a second portage tree.

Replies

Subject Author
Re: [gentoo-portage-dev] Re: search functionality in emerge Emma Strubell <emma.strubell@×××××.com>
Re: [gentoo-portage-dev] Re: search functionality in emerge Alec Warner <antarus@g.o>