Gentoo Archives: gentoo-dev

From: Joshua Kinard <kumba@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] RFC GLEP 1005: Package Tags
Date: Sun, 23 Mar 2014 22:54:49
Message-Id: 532F6625.4030507@gentoo.org
In Reply to: Re: [gentoo-dev] RFC GLEP 1005: Package Tags by "Michał Górny"
1 On 03/23/2014 17:51, Michał Górny wrote:
2 > On 2014-03-23, at 17:40:20
3 > Joshua Kinard <kumba@g.o> wrote:
4 >
5 >> On 03/23/2014 17:05, Michał Górny wrote:
6 >>> On 2014-03-23, at 16:27:43
7 >>> Joshua Kinard <kumba@g.o> wrote:
8 >>>
9 >>>> On 03/23/2014 15:44, Michał Górny wrote:
10 >>>>> Tags, on the other hand, are more 'live'. They place the package
11 >>>>> somewhere in the 'global' tag hierarchy that can change over time.
12 >>>>> I expect that people other than maintainers will be adding tags to
13 >>>>> packages (and changing them), and that people will invent new tags
14 >>>>> and apply them to more packages.
15 >>>>>
16 >>>>> So, first of all, your solution would mean that every commit adding
17 >>>>> a new tag or changing one of the tags would modify the package
18 >>>>> metadata.xml. This means a Manifest update and a ChangeLog entry (please
19 >>>>> don't get into more rules for ChangeLogs now), and this means it will be
20 >>>>> harder to find actually useful entries there.
21 >>>>>
22 >>>>> So we make tag updates harder, and increase time and size of rsync.
23 >>>>
24 >>>> Instead of individual <tag> lines in metadata.xml for each tag, why not a
25 >>>> single <tags> line that contains a comma-delimited list of up to five tags,
26 >>>> whitespace optional? That should help reduce the "fluff" of the tree by
27 >>>> adding this feature.
28 >>>>
29 >>>> E.g.,
30 >>>>
31 >>>> <tags>one,two,three,four,five</tags>
32 >>>
33 >>> Either use XML, or don't use XML. Don't make this some kind of ugly
34 >>> mixture of XML with non-XML.
35 >>>
36 >>> So:
37 >>>
38 >>> <tags>
39 >>> <tag>one</tag>
40 >>> <tag>two</tag>
41 >>> </tags>
42 >>>
43 >>> if we're really going for this. But I guess our DTD doesn't allow easy
44 >>> definition of single <tags/> with no forced position.
45 >>
46 >> TBH, I don't like the use of XML at all. Never have and never will. I am a
47 >> big fan of INI-style definitions (i.e., like Samba's config). XML just
48 >> leads to a lot of unneeded fluff in what should be a really small file,
49 >> which is why I was proposing a single <tags> element instead of multiple
50 >> <tag> elements.
51 >
52 > metadata.xml is XML at the moment, so you are supposed to obey its
53 > rules, whether you like them or not. If you want to replace it with
54 > something else, feel free to try. But don't make a shitsoup mixin out
55 > of it.
56
57 I'm not proposing to change it now...bit too late for that. But if I ever
58 come across a TARDIS on eBay, well...
59
60 That said, is XML so strict that every single value has to be wrapped in
61 its own element? Is a comma-separated list of values inside a single
62 element actually prohibited by the spec? I don't use XML often (if at
63 all), so I am not familiar with its intricacies.
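
On the spec question: XML itself only requires well-formedness, so a
comma-separated list as the text of one element is legal; whether it is
*allowed* in metadata.xml is up to the DTD. The practical difference is that
the flat form needs a second, non-XML split step. A minimal sketch with
Python's standard xml.etree.ElementTree, assuming the <tags>/<tag> element
names proposed in this thread (they are not part of the current DTD):

```python
import xml.etree.ElementTree as ET

# Form 1 (hypothetical): one <tags> element holding a comma-delimited list.
flat = ET.fromstring("<pkgmetadata><tags>one, two,three</tags></pkgmetadata>")

# Form 2 (hypothetical): one <tag> element per value.
nested = ET.fromstring(
    "<pkgmetadata><tags>"
    "<tag>one</tag><tag>two</tag><tag>three</tag>"
    "</tags></pkgmetadata>"
)


def tags_from_flat(root):
    """Split the text of <tags> on commas; surrounding whitespace is optional."""
    text = root.findtext("tags", default="")
    return [t.strip() for t in text.split(",") if t.strip()]


def tags_from_nested(root):
    """Collect the text of every <tag> child of <tags>."""
    return [t.text.strip() for t in root.findall("./tags/tag") if t.text]


print(tags_from_flat(flat))
print(tags_from_nested(nested))
```

Both documents parse without complaint; only the second exposes the list
structure to generic XML tooling (XPath, DTD validation) without extra
string handling.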
64
65
66 >>>>> Secondly, since tags for every package will be held in different files,
67 >>>>> people will need dedicated tools to collect tags from all those files
68 >>>>> and add matching tags to their own packages. Long story short, we're
69 >>>>> going to have many 'duplicate' tags that will require even more commits
70 >>>>> with ChangeLog entries and Manifest updates.
71 >>>>
72 >>>> If we automate the generation of a master tag index file, like
73 >>>> use.local.desc, this can be avoided. emerge can simply go rummage through
74 >>>> the master index for matching tag entries instead of going through the
75 >>>> entire tree. Because if we wanted to sift through the entire tree, grep
76 >>>> would be a far better method (compiled C and probably better text-matching
77 >>>> algorithms than emerge).
78 >>>
79 >>> And this goes pretty much backwards to what we were aiming at. We
80 >>> should finally kill use.local.desc, not get inspired by the redundancy.
81 >>
82 >> And what replaces it? What differentiates a global USE flag that has
83 >> purpose across multiple packages (like 'ipv6') against a flag that only
84 >> exists for a single package?
85 >
86 > Applications are supposed to read metadata.xml for local flags. That's
87 > all about it. Having an extra index file doesn't really make sense
88 > there.
89
90 But they don't currently, do they? As far as I know, most everything parses
91 the use.local.desc file. Wouldn't having portage apps read/parse every
92 package's metadata.xml file introduce a lot of disk I/O to seek out those
93 files across the entire tree? That would seem like a bigger step backwards
94 if so.
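
To make the trade-off concrete, here is a rough sketch of what generating
such a master index could look like. It assumes a tree laid out as
category/package/metadata.xml and the hypothetical <tag> elements discussed
above; build_tag_index and write_tag_index are illustrative names, not
existing Portage functions, and the one-line-per-tag output format is made up:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from pathlib import Path


def build_tag_index(tree_root):
    """Map each tag to the sorted list of category/package names carrying it."""
    index = defaultdict(set)
    # One pass over every package's metadata.xml: this is the tree-wide disk
    # I/O that a pregenerated index would spare each client from repeating.
    for meta in Path(tree_root).glob("*/*/metadata.xml"):
        try:
            root = ET.parse(meta).getroot()
        except ET.ParseError:
            continue  # skip malformed files rather than abort the whole run
        package = f"{meta.parent.parent.name}/{meta.parent.name}"
        for tag in root.iter("tag"):
            if tag.text and tag.text.strip():
                index[tag.text.strip()].add(package)
    return {tag: sorted(pkgs) for tag, pkgs in index.items()}


def write_tag_index(index, out_path):
    """Write one 'tag: pkg pkg ...' line per tag, sorted by tag name."""
    with open(out_path, "w", encoding="utf-8") as out:
        for tag in sorted(index):
            out.write(f"{tag}: {' '.join(index[tag])}\n")
```

The walk happens once at index-generation time; a tag query afterwards reads
a single file instead of seeking across the whole tree.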
95
96
97 >>>>> Worse than that, your GLEP doesn't even have any basic rules for naming
98 >>>>> tags -- like what language form to use and, say, which character to use
99 >>>>> instead of space. This sounds like the sort of thing that's going to
100 >>>>> make it even harder to get some consistency, especially if some
101 >>>>> developers are going to follow someone else committing earlier and some
102 >>>>> will follow their own rules.
103 >>>>
104 >>>> Easy: ASCII, alphanumeric only, must start with a letter, lowercase, no
105 >>>> spaces. A lot of problems are avoided if we keep tags to one-word
106 >>>> descriptors only. E.g., for mail clients, they would carry both 'mail' and
107 >>>> 'client' as two of their five tags. For kmail, a third tag would be 'kde'
108 >>>> and Evolution would have 'gnome' instead.
109 >>>
110 >>> I'm pretty sure you will finally hit something that goes with two
111 >>> words. Protocol name or something.
112 >>
113 >> Perhaps, but we can fight that battle when we get there. Starting off with
114 >> one-word tags keeps things simple for now and that'll make it easier to
115 >> determine whether this experiment actually pans out or not.
116 >
117 > If you introduce arbitrary limitations, people will either find a way
118 > around them (which means getting even worse mess) or omit some tags.
119 > Either way, tags become less helpful.
120
121 Everything trends towards greater entropy, whether we like it or not.
122 Portage started with the basic idea of Ports, but it's grown way beyond that
123 over the years. USE flags were supposed to be simple switches for
124 controlling compile-time functionality, emerge used to be the only package
125 manager, and Gentoo used to only support the Linux kernel and sysvinit scripts.
126
127 Whatever implementation of tags is adopted, if any, will eventually grow
128 beyond its original design parameters. If tags are not adopted, something
129 else will probably get proposed and adopted down the road that will outgrow
130 its design parameters. The question is, are tags the best we can do *now*,
131 or do we wait for some better idea to appear down the road and then go with
132 that instead?
133
134
135 >>>> I'd also suggest that 'all' be considered a default, global tag for all
136 >>>> packages, it be a reserved tag internal to emerge and other package
137 >>>> managers, and not count against the number of allowed tags (meaning that
138 >>>> technically, a package is allowed five tags + 'all').
139 >>>>
140 >>>> As for default tags when a package does not define any, the package category
141 >>>> gets split at the hyphen and becomes two independent tags. This is
142 >>>> overridden when at least one tag is defined in metadata.xml.
143 >>>
144 >>> Will this have a real benefit? Sounds like unnecessary confusion for
145 >>> a minor gain to me.
146 >>
147 >> Which? The internal 'all' tag or the use of existing category names as a
148 >> default set of tags for packages that don't have any tags defined?
149 >
150 > The 'all' tag sounds like something that would have no value.
151
152 Okay, let's ignore that then. I'm just brainstorming -- not every idea has
153 worth or merit.
154
155
156 > The automagic tags sound like a way to confuse people -- yesterday it
157 > had this tag, now I wanted to add a new one and the old tag
158 > disappeared! Not to mention sometimes the categories don't give really
159 > useful tags. Tags are not replacing categories, so no point in trying
160 > to bind the two together.
161
162 I am not suggesting that tags replace categories. Categories were the
163 original way to group packages (again, deriving from how Ports does it), so
164 when no tags are defined for a package, they offer a somewhat-suitable
165 fill-in. That's not binding the two in any direct way, it's just offering a
166 default/fallback set of tags until a package maintainer updates metadata.xml
167 to add actual tag definitions.
168
169 Sample Python pseudocode:
170
171 if not package.tags:
172     package.tags = package.category.split('-')
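
A slightly fuller, runnable sketch of the same fallback, combined with the
naming rule suggested earlier in the thread (ASCII, lowercase, alphanumeric,
starts with a letter, no spaces) and the five-tag cap. The names TAG_RE,
MAX_TAGS, is_valid_tag and effective_tags are made up for illustration;
nothing here reflects an existing Portage API:

```python
import re

# Proposed rule: lowercase ASCII letters and digits only, starting with a letter.
TAG_RE = re.compile(r"[a-z][a-z0-9]*")
MAX_TAGS = 5  # proposed cap on explicitly declared tags


def is_valid_tag(tag):
    """Return True if the tag matches the proposed naming rule."""
    return TAG_RE.fullmatch(tag) is not None


def effective_tags(category, declared_tags):
    """Return the declared tags if any, else fall back to the split category."""
    if declared_tags:
        bad = [t for t in declared_tags if not is_valid_tag(t)]
        if bad:
            raise ValueError(f"invalid tag name(s): {bad}")
        if len(declared_tags) > MAX_TAGS:
            raise ValueError(f"more than {MAX_TAGS} tags declared")
        return list(declared_tags)
    # No tags in metadata.xml: 'mail-client' becomes ['mail', 'client'].
    return category.split("-")


print(effective_tags("mail-client", []))
print(effective_tags("mail-client", ["mail", "client", "kde"]))
```

As soon as a maintainer declares at least one tag, the category-derived
defaults are no longer used.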
173
174 If you have a better idea, I am definitely all ears.
175
176 --
177 Joshua Kinard
178 Gentoo/MIPS
179 kumba@g.o
180 4096R/D25D95E3 2011-03-28
181
182 "The past tempts us, the present confuses us, the future frightens us. And
183 our lives slip away, moment by moment, lost in that vast, terrible in-between."
184
185 --Emperor Turhan, Centauri Republic
