Gentoo Archives: gentoo-dev

From: Alec Warner <antarus@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future
Date: Mon, 01 Dec 2008 01:05:22
Message-Id: b41005390811301705teb32eafmc67c443a82d01e85@mail.gmail.com
In Reply to: [gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future by flameeyes@gmail.com (Diego 'Flameeyes' Pettenò)
1 On Sun, Nov 30, 2008 at 3:12 PM, Diego 'Flameeyes' Pettenò
2 <flameeyes@×××××.com> wrote:
3 > "Alec Warner" <antarus@g.o> writes:
4 >
5 >> Diego, What are the concrete benefits of your proposal?
6 >
7 > As I said:
8 >
9 > - no need to replicate homepage data between versions; even though forks
10 > can change homepage, I would expect that to at worst split in two a
11 > package, or have to be different by slot, like Java;
12 > - allows proper handling of packages lacking a HOMEPAGE;
13 > - less data in metadata cache;
14 > - users can check the metadata much more easily by just opening the xml
15 > file or interfacing to that rather than having to skim through the
16 > ebuild, the xml files are probably more user readable than ebuilds
17 > using multiple eclasses;
18 > - displaying info about the package does not require parsing the full
19 > ebuild file, with its eclasses;
20 > - extensible to provide more links than just the homepage (forums,
21 > trackers, gentoo-specific documentation, ...);
22 > - if we also move DESCRIPTION, search software can ignore everything
23 > about ebuild parsing, and just use the metadata.xml files; considering
24 > how many people actually use or used eix, it would make sense to allow
25 > third-party applications to be able to search through the tree;
26 > - webapps like packages.gentoo.org would be able to display basic
27 > information without having to parse the ebuilds or the metadata cache.
28 > - as much as people might think metadata is easier to parse than
29 > anything, XML has one huge advantage: there are plenty of parsers for
30 > any language without having to actually write one, even as easy as it
31 > can be, and it's easily interfaced with anything; I wrote a simple XSL
32 > file that outputs the basic metadata details for packages without
33 > having any parser or executable code but xsltproc (or any other XSLT
34 > software), correlating data with herds.xml too;
35 > - it really is metadata, and it makes very little sense to need parsing
36 > of eclasses and EAPI handling to get some data from a package that is
37 > non-functional in nature and free form (just like DESCRIPTION, and
38 > unlike LICENSE like Alec said), and that changes at worst once each
39 > slot (unlike LICENSE that can change at any given version).
40 >
41 > Disadvantages:
42 >
43 > - it requires user-interface software to parse metadata.xml to show
44 > data for a package; which is already needed to show per-package USE
45 > flag meaning;
46 >
47 > General points:
48 >
49 > - it does not solve unrelated problems like code replication;
50 >
51 > Can someone come up with any other point beside "I don't like XML"
52 > (which I already said is a puny answer) and "it can theoretically be 10
53 > different homepages for 10 different versions" (which I have sincerely
54 > some beef with myself since if you fork a software you might as well
55 > change its name)?
56 >
57 > As I said, moving out the HOMEPAGE field from a package manager
58 > perspective is non functional; if you're showing to the user some data
59 > about a package you might as well show as much as you can, like long
60 > descriptions, other links, and USE flags. And the fact that you can ask
61 > the package manager for something is for me not a valid reason to avoid
62 > moving something to a more approachable place for other software.
63
64 Ciaran covered most of my points already.
65
66 Third party programs should not parse ebuilds and eclasses by hand.
67 I'd expect half of them to get it wrong if they tried.
68 Ebuild parsing is hard; that is why we have three complex software
69 packages that for the most part do it properly.
70
71 Why is 'ask the package manager' an invalid reason for not making
72 something more accessible?
73 How accessible must this data be?
74
75 Writing an XML parser is not accessible enough (for me); we should
76 just put it in plain text on the hard drive, perhaps in
77 "/var/cache/edb/dep/${PORTDIR}/$C/$PV"
78
79 Oh wait, we do that already[1].
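
To illustrate how little parsing such a plain-text entry needs, here is a minimal sketch. It assumes the cache entry is a flat list of KEY=value lines, which is what grepping for HOMEPAGE further down relies on; the sample entry and the parse_cache_entry helper are made up for illustration and are not part of any package manager API.

```python
def parse_cache_entry(text):
    """Parse KEY=value lines into a dict; lines without '=' are skipped."""
    entry = {}
    for line in text.splitlines():
        # Split on the first '=' only, so values may contain '=' themselves.
        key, sep, value = line.partition("=")
        if sep:
            entry[key] = value
    return entry


# Hypothetical cache entry, inlined instead of reading a real file.
sample = (
    "DESCRIPTION=An example package\n"
    "HOMEPAGE=http://example.org/\n"
    "SLOT=0\n"
)

entry = parse_cache_entry(sample)
print(entry["HOMEPAGE"])
```

This only reads data the package manager already generated; it does not replace asking the package manager, and ebuilds without a cache entry would be missed.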
80
81 So really this is where I'm confused.
82 If third parties are using the package manager APIs to get at this
83 data, the only reasons to move it out of ebuilds are:
84
85 - Space savings. Certainly your scheme may be smaller, but the XML
86 tag overhead may eat into the savings. You should do some estimates
87 to show the community how much smaller the tree will be from this
88 proposal.
89
90 Randomly looking:
91
92 cd /var/cache/edb/dep/usr/portage
93 grep -hR HOMEPAGE . | wc -m
94 yields 1.1 million characters. Each is 1 byte in UTF-8 if ASCII (others take 2-4).
95 So at best you could save the 1.2GB tree 2.2 million bytes (about 2
96 megs) if your scheme was (more than) 100% efficient.
97 The extra 1.1 million characters comes from the space freed in the
98 cache (since we don't cache metadata.xml).
99
100 2 megs into 1200 megs is.. ".166666%" of the tree. As I thought, not
101 very compelling.
102
103 Looking at DESCRIPTION:
104
105 grep -hR DESCRIPTION . | wc -m
106 yields ~1.5 million characters. Nice!
107
108 So if we purge that from the cache and replace it with a (more than)
109 100% efficient metadata.xml solution we could save: 3 megs
110
111 3 megs saved + 2 megs saved = 5 megs saved. 5 / 1200 = .416666% of
112 the tree. Still again not very compelling.
113
114 - Extra Tags. Extending HOMEPAGE is harder than changing
115 metadata.xml, which I imagine is part of the reason why you proposed
116 it.
117 It will be EAPI3 at the earliest before we can get the HOMEPAGE tags
118 in ebuilds implemented, and then we have to bump affected ebuilds
119 to EAPI3.
120
121 However if we drop the 'extra tags' bit then the only reason to move
122 the data is space, and I imagine the space savings will not be
123 compelling; but feel free to prove me wrong.
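
For anyone who wants to redo the estimate with their own numbers, the arithmetic above can be sketched as follows. The figures are the approximate ones quoted in this message; the function names are made up for illustration, and the factor of two assumes each character is saved once in the ebuilds and once in the cache, at one byte per character.

```python
# Approximate figures quoted in the message.
TREE_MB = 1200            # size of the tree in megabytes
HOMEPAGE_CHARS = 1.1e6    # characters matched for HOMEPAGE
DESCRIPTION_CHARS = 1.5e6  # characters matched for DESCRIPTION


def utf8_bytes(text):
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def saved_mb(chars):
    """Megabytes saved, assuming 1 byte per character, saved in ebuilds and cache."""
    return 2 * chars / 1e6


def percent_of_tree(megabytes):
    """Savings expressed as a percentage of the whole tree."""
    return 100 * megabytes / TREE_MB


# ASCII is 1 byte per character; a character such as 'ò' takes 2 bytes.
print(len("Pettenò"), utf8_bytes("Pettenò"))

total = saved_mb(HOMEPAGE_CHARS) + saved_mb(DESCRIPTION_CHARS)
print(f"{total:.1f} MB saved, {percent_of_tree(total):.3f}% of the tree")
```

With the unrounded figures this comes to roughly 5.2 MB, or about 0.43% of the tree, slightly above the rounded 5 MB used above; the conclusion does not change.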
124
125 [1] For ebuilds that have cache entries, using the default cache
126 implementation for portage.
127
128 >
129 > --
130 > Diego "Flameeyes" Pettenò
131 > http://blog.flameeyes.eu/
132 >

Replies

Subject Author
[gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future flameeyes@gmail.com (Diego 'Flameeyes' Pettenò)