Gentoo Archives: gentoo-dev

From: Joshua Kinard <kumba@g.o>
To: "Michał Górny" <mgorny@g.o>, gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] RFC GLEP 1005: Package Tags
Date: Sun, 23 Mar 2014 21:40:36
Message-Id: 532F54C4.7080205@gentoo.org
In Reply to: Re: [gentoo-dev] RFC GLEP 1005: Package Tags by "Michał Górny"
1 On 03/23/2014 17:05, Michał Górny wrote:
2 > On 2014-03-23, at 16:27:43,
3 > Joshua Kinard <kumba@g.o> wrote:
4 >
5 >> On 03/23/2014 15:44, Michał Górny wrote:
6 >>> Tags, on the other hand, are more 'live'. They place the package
7 >>> somewhere in the 'global' tag hierarchy that can change over time.
8 >>> I expect that people other than maintainers will be adding tags to
9 >>> packages (and changing them), and that people will invent new tags
10 >>> and apply them to more packages.
11 >>>
12 >>> So, first of all, your solution would mean that every commit adding
13 >>> a new tag or changing one of the tags would modify the package
14 >>> metadata.xml. This means a Manifest update and a ChangeLog entry (please
15 >>> don't get into more rules for ChangeLogs now), and this means it will be
16 >>> harder to find actually useful entries there.
17 >>>
18 >>> So we make tag updates harder, and increase time and size of rsync.
19 >>
20 >> Instead of individual <tag> lines in metadata.xml for each tag, why not a
21 >> single <tags> line that contains a comma-delimited list of up to five tags,
22 >> whitespace optional? That should help reduce the "fluff" of the tree by
23 >> adding this feature.
24 >>
25 >> E.g.,
26 >>
27 >> <tags>one,two,three,four,five</tags>
28 >
29 > Either use XML, or don't use XML. Don't make this some kind of ugly
30 > mixture of XML with non-XML.
31 >
32 > So:
33 >
34 > <tags>
35 > <tag>one</tag>
36 > <tag>two</tag>
37 > </tags>
38 >
39 > if we're really going for this. But I guess our DTD doesn't allow easy
40 > definition of single <tags/> with no forced position.
41
42 TBH, I don't like the use of XML at all. Never have and never will. I am a
43 big fan of INI-style definitions (i.e., like Samba's config). XML just
44 leads to a lot of unneeded fluff in what should be a really small file,
45 which is why I was proposing a single <tags> element instead of multiple
46 <tag> elements.
47
48 E.g., for local USE flags, instead of this:
49
50 <use>
51 <flag name='foo'>FOO</flag>
52 <flag name='bar'>BAR</flag>
53 <flag name='baz'>BAZ</flag>
54 </use>
55
56 (96 bytes)
57
58 This would be better:
59
60 [local use]
61 foo = "FOO"
62 bar = "BAR"
63 baz = "BAZ"
64
65 (47 bytes)
66
67 Not a complicated example, but would be >50% reduction in size. But, I
68 digress...
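The byte counts above can be checked with a short Python sketch. It assumes
both snippets are stored without indentation and without a trailing newline,
which is the reading under which the 96 and 47 figures work out; the variable
names are illustrative only.

```python
# Size comparison of the two local-USE snippets from the message.
# Assumption: no indentation and no trailing newline in either form.
xml_form = "\n".join([
    "<use>",
    "<flag name='foo'>FOO</flag>",
    "<flag name='bar'>BAR</flag>",
    "<flag name='baz'>BAZ</flag>",
    "</use>",
])

ini_form = "\n".join([
    "[local use]",
    'foo = "FOO"',
    'bar = "BAR"',
    'baz = "BAZ"',
])

xml_size = len(xml_form.encode("ascii"))
ini_size = len(ini_form.encode("ascii"))
reduction = 1 - ini_size / xml_size  # fraction of bytes saved by the INI form

print(xml_size, ini_size, f"{reduction:.0%}")  # prints: 96 47 51%
```

Any indentation in the XML form would only widen the gap, since the INI form
has no nesting to indent.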
69
70
71 >>> Secondly, since tags for every package will be held in different files,
72 >>> people will need dedicated tools to collect tags from all those files
73 >>> and add matching tags to their own packages. Long story short, we're
74 >>> going to have many 'duplicate' tags that will require even more commits
75 >>> with ChangeLog entries and Manifest updates.
76 >>
77 >> If we automate the generation of a master tag index file, like
78 >> use.local.desc, this can be avoided. emerge can simply go rummage through
79 >> the master index for matching tag entries instead of going through the
80 >> entire tree. Because if we wanted to sift through the entire tree, grep
81 >> would be a far better method (compiled C and probably better text-matching
82 >> algorithms than emerge).
83 >
84 > And this goes pretty much backwards to what we were aiming at. We
85 > should finally kill use.local.desc, not get inspired by the redundancy.
86
87 And what replaces it? What differentiates a global USE flag that has
88 purpose across multiple packages (like 'ipv6') against a flag that only
89 exists for a single package?
90
91 I'll agree that USE flags have definitely gotten out of control, and the
92 trend now seems to be moving sharply away from setting USE flags globally
93 in make.conf and toward per-package USE flags in
94 /etc/portage/package.use. Which, while offering more granular control, can
95 be mind-numbingly annoying at times.
96
97 The automated generation of use.local.desc definitely made maintenance of
98 some things easier. We've gotta index USE flags somehow, and separating
99 them into global and local categories still makes sense to me. But, I'm
100 probably just going senile...
101
102
103 >>> Worse than that, your GLEP doesn't even have any basic rules for naming
104 >>> tags -- like what language form to use and, say, which character to use
105 >>> instead of space. This sounds like the sort of thing that's going to
106 >>> make it even harder to get some consistency, especially if some
107 >>> developers are going to follow someone else committing earlier and some
108 >>> will follow their own rules.
109 >>
110 >> Easy: ASCII, alphanumeric only, must start with a letter, lowercase, no
111 >> spaces. A lot of problems are avoided if we keep tags to one-word
112 >> descriptors only. E.g., for mail clients, they would carry both 'mail' and
113 >> 'client' as two of their five tags. For kmail, a third tag would be 'kde'
114 >> and Evolution would have 'gnome' instead.
115 >
116 > I'm pretty sure you will finally hit something that goes with two
117 > words. Protocol name or something.
118
119 Perhaps, but we can fight that battle when we get there. Starting off with
120 one-word tags keeps things simple for now and that'll make it easier to
121 determine whether this experiment actually pans out or not.
122
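The naming rule and the comma-delimited list format discussed in this thread
could be checked roughly as follows. This is only a sketch: the names TAG_RE,
MAX_TAGS, is_valid_tag and parse_tags are hypothetical, and nothing like this
exists in emerge. It assumes the rule means a lowercase ASCII letter followed
by lowercase ASCII letters or digits.

```python
import re

# Assumed reading of the rule: ASCII, alphanumeric only, lowercase,
# must start with a letter, no spaces or punctuation.
TAG_RE = re.compile(r"[a-z][a-z0-9]*")
MAX_TAGS = 5  # the "up to five tags" limit suggested earlier in the thread


def is_valid_tag(tag: str) -> bool:
    """Return True if the tag follows the one-word naming rule."""
    return TAG_RE.fullmatch(tag) is not None


def parse_tags(text: str) -> list[str]:
    """Parse a comma-delimited tag list; whitespace around tags is optional."""
    tags = [part.strip() for part in text.split(",") if part.strip()]
    bad = [t for t in tags if not is_valid_tag(t)]
    if bad:
        raise ValueError(f"invalid tag(s): {bad}")
    if len(tags) > MAX_TAGS:
        raise ValueError(f"too many tags: {len(tags)} > {MAX_TAGS}")
    return tags


print(parse_tags("mail, client,kde"))  # prints: ['mail', 'client', 'kde']
```

A two-word tag such as "web-server" is rejected under this reading, which is
exactly the case that would need a rule change later.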
123
124 >> I'd also suggest that 'all' be considered a default, global tag for all
125 >> packages, that it be a reserved tag internal to emerge and other package
126 >> managers, and that it not count against the number of allowed tags (meaning
127 >> that technically, a package is allowed five tags + 'all').
128 >>
129 >> As for default tags when a package does not define any, the package category
130 >> gets split at the hyphen and becomes two independent tags. This is
131 >> overridden when at least one tag is defined in metadata.xml.
132 >
133 > Will this have a real benefit? Sounds like unnecessary confusion for
134 > a minor gain to me.
135
136 Which? The internal 'all' tag or the use of existing category names as a
137 default set of tags for packages that don't have any tags defined?
138
139 The 'all' thing is probably unnecessary, as the same effect can be achieved with
140 wildcarding or some other programming trick. The latter is just a way to
141 avoid having to handle the lack of tags. Because if this is implemented,
142 it's going to take years for most of the packages in the tree to get tags
143 assigned to them. By having a default set of tags to link most packages to,
144 it makes finding them via a tag search easy. E.g., even if a particular
145 package in dev-python lacks tags, you can still find it by searching for the
146 tag "python".
147
148 Granted, a tag of "dev" offers no value (dev-python -> 'dev','python'), but
149 if you were looking for a web browser versus a web server, having default
150 tags of 'www','client' or 'www','servers' helps for packages in www-client
151 and www-servers.
152
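The default-tag fallback described above might look like this. It is a sketch
under stated assumptions: effective_tags is a hypothetical helper, the
explicit tags are assumed to have already been read from metadata.xml, and
categories are assumed to contain at most one hyphen.

```python
def effective_tags(category: str, explicit_tags: list[str]) -> list[str]:
    """Return the tags a package is searchable under."""
    # Tags defined in metadata.xml override the category-derived defaults.
    if explicit_tags:
        return list(explicit_tags)
    # Otherwise split the category at the hyphen: "www-client" -> www, client.
    return category.split("-", 1)


print(effective_tags("dev-python", []))    # prints: ['dev', 'python']
print(effective_tags("www-servers", []))   # prints: ['www', 'servers']
print(effective_tags("mail-client", ["mail", "client", "kde"]))
```

A category with no hyphen simply becomes a single default tag.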
153
154 Tags aside, wasn't there a proposal long ago to re-categorize the entire
155 tree because someone felt that the double-atom naming mechanism for
156 categories (atom1-atom2) was neither flexible nor descriptive enough? The entire
157 Portage tree idea derives from Ports, and it's really ballooned over the
158 years, while a modern-day Ports tree in /usr/ports is still pretty small and
159 self-contained. I've always wondered whether allowing portage one
160 additional level of nesting would help any (i.e., games-* -> games/*).
161 It really seems like this is what tags are attempting to solve, so maybe that
162 problem needs to be revisited instead.
163
164
165 --
166 Joshua Kinard
167 Gentoo/MIPS
168 kumba@g.o
169 4096R/D25D95E3 2011-03-28
170
171 "The past tempts us, the present confuses us, the future frightens us. And
172 our lives slip away, moment by moment, lost in that vast, terrible in-between."
173
174 --Emperor Turhan, Centauri Republic
