Gentoo Archives: gentoo-portage-dev

From: Paul de Vrieze <pauldv@g.o>
To: gentoo-portage-dev@l.g.o
Subject: Re: [gentoo-portage-dev] Current portage well designed, but badly used
Date: Sun, 28 Nov 2004 19:37:56
Message-Id: 200411282037.58501.pauldv@gentoo.org
In Reply to: [gentoo-portage-dev] Current portage well designed, but badly used by Gustavo Barbieri
1 On Sunday 28 November 2004 00:10, Gustavo Barbieri wrote:
2 > Hello,
3 >
4 > I'm playing with portage and noticed it's well designed, but there are
5 > some mistakes in its usage at the moment. For example:
6 >
7 > Categories are mixed: there is a net-www/apache and net-www/mod_*
8 > (apache modules), but there is a more convenient category www-apache/
9 > for them. This is one example; there are more. Is there any
10 > plan to fix them in the next portage releases?
11 >
12 > Some packages use zero-padded version numbers. That's good for
13 > listing with shell functions, but bad because you can't convert them
14 > to numbers and then back to strings. For example:
15 > mail-mta/nullmailer-1.00_rc7-r4. If you convert it to a number, it
16 > becomes 1.0 and you can't map it back to the ebuild.
17 >
18 > Portage provides metadata.xml, cool. But it's hardly used :(
19 > metadata.xml seems to provide tags for maintainers, changelogs and
20 > long descriptions, but many (most?) packages don't use them.
21 >
22 > The portage library is too heavy and complicated, and it makes things slow.
23 > I noticed the heaviness and complexity from (trying to) read the source,
24 > and the slowness from usage. For example:
25 >
26 > time emerge # without parameters
27 > real 0m0.614s
28 > user 0m0.487s
29 > sys 0m0.046s
30 >
31 > time emerge -pv world # 16 packages to be upgraded
32 > real 0m22.664s
33 > user 0m12.423s
34 > sys 0m1.130s
35 >
36 > It's too much; look at Debian's apt, it's fast. And I can't see why
37 > portage is slow.
38 > Forgive me if I'm wrong, but portage just needs to parse
39 > /var/lib/portage/world (237 entries in my case), then for each entry check
40 > whether there is a newer version and, if so, check its
41 > dependencies. Why 22 seconds? A hand-made one takes less than 1.
42 >
43 >
44 > Also, a brief explanation of why I was playing with portage, and some
45 > requests: I'm coding (for fun, with no plan to reach a production state)
46 > yet another graphical package manager atop portage with the newbie in
47 > mind. But to achieve my goal I need:
48 >
49 > - a fast portage. I'm now writing a module to do this for me (see
50 > more below), at least the basics, like getting package information,
51 > versions, ... and if possible resolving primary dependencies (just to
52 > show to the user in a "Dependencies" tab, hidden by default).
53 >
54 > - more metadata: if possible, a list of URLs to screenshots (most
55 > packages have a screenshots section). If the URL links to an HTML page,
56 > provide an image-size threshold, so it connects and
57 > downloads every image bigger than that... cached, of course.
58 >
59 > - portage to act as a daemon that queues requests and fetches packages.
60 > Suppose portage were a daemon with 3 threads: one that downloads
61 > packages, one that compiles, and one that manages the others and accepts
62 > requests. It could then schedule downloads to maximize
63 > throughput, downloading smaller packages first while respecting
64 > dependencies, and compile while downloading, waiting until packages are there.
65 > The "emerge" command would just send commands to it. It would be handy
66 > since compile times are huge.
67 >
68 >
69 > About the fast portage: I know portage is a complex monster and is the
70 > heart of Gentoo; if it breaks, everything breaks. But how about a
71 > Python module to be used by other packages that just want to view
72 > portage and its packages? If this module eventually works as expected
73 > and has every current portage feature, it could replace the old one.
74 > I started to code my own "fast portage", but some things are tricky
75 > to do, and I want to know how you do them: how do you parse ebuilds to
76 > get USE, DESCRIPTION, SLOT, DEPEND, ... ?
77 > If you want to know why my implementation is fast: I use lazy
78 > evaluation as far as possible. For example, I load every package, but
79 > the attributes for available versions, installed versions, and status
80 > are only calculated on demand; I use Python's property() and
81 > setters/getters for that. Since you'll hardly ever use every attribute of
82 > everything, it loads much faster.
83 > I have preliminary code here:
84 > http://ltc08.ic.unicamp.br/~gustavo/packagemanager.tar.bz2, but some
85 > modifications I made were lost in a power outage + xfs... I just have
86 > the .pyc, if someone knows how to get the .py back...
87
88 Well, as said, portage does not parse ebuilds itself, but uses bash. This is not
89 really fast. The biggest issue, however, is the absence of lazy evaluation.
90 I've been looking at it too, and even have a C++-based parser that can be
91 accessed as a Python module, but it's undocumented and has issues, as it is
92 not a full bash replacement, and ebuilds expect bash.
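
To illustrate the lazy evaluation discussed in this thread, here is a minimal
sketch, not portage code. The class LazyPackage, its tree_root argument and the
scan_count counter are hypothetical names made up for this example, and the
on-disk layout (tree_root/category/name/name-version.ebuild) is an assumption.
It uses functools.cached_property (Python 3.8+) rather than hand-written
property() getters, but the idea is the same: creating the object is cheap,
and the directory is only scanned when the attribute is first read.

```python
import os
from functools import cached_property


class LazyPackage:
    """Hypothetical package record; not part of the portage API."""

    def __init__(self, category, name, tree_root):
        # Construction only stores strings; nothing touches the disk yet.
        self.category = category
        self.name = name
        self.tree_root = tree_root
        self.scan_count = 0  # counts directory scans, for demonstration only

    @cached_property
    def available_versions(self):
        # Runs on first access only; the result is then cached on the instance.
        self.scan_count += 1
        pkg_dir = os.path.join(self.tree_root, self.category, self.name)
        prefix = self.name + "-"
        suffix = ".ebuild"
        versions = []
        for entry in sorted(os.listdir(pkg_dir)):
            if entry.startswith(prefix) and entry.endswith(suffix):
                # Keep the version as a string so "1.00" never becomes 1.0.
                versions.append(entry[len(prefix):-len(suffix)])
        return versions
```

Creating thousands of such objects costs almost nothing; only the packages
whose available_versions is actually read trigger a directory scan, and each
of those is scanned once.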
93
94 Paul
95
96 --
97 Paul de Vrieze
98 Gentoo Developer
99 Mail: pauldv@g.o
100 Homepage: http://www.devrieze.net