Gentoo Archives: gentoo-dev

From: Joshua Kinard <kumba@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] RFC GLEP 1005: Package Tags
Date: Sun, 23 Mar 2014 20:27:54
Message-Id: 532F43BF.7070405@gentoo.org
In Reply to: Re: [gentoo-dev] RFC GLEP 1005: Package Tags by "Michał Górny"
1 On 03/23/2014 15:44, Michał Górny wrote:
2 > On 2014-03-22, at 15:33:27,
3 > Alec Warner <antarus@g.o> wrote:
4 >
5 >> https://wiki.gentoo.org/wiki/Package_Tags
6 >>
7 >> Object or forever hold your peace.
8 >
9 > Honestly, I don't think metadata.xml is a good place for it. While I
10 > like the consistency with general use of that file, I feel like it's
11 > going to make the application of tags more cumbersome, more noisy
12 > and make it harder to maintain consistency.
13 >
14 > As I see it, tags are not the same kind of package property
15 > as the description or package name. As I see it, current metadata.xml
16 > properties are somewhat constant. They are usually set
17 > by the maintainer, do not change often and are strictly related only to
18 > the package.
19
20 IMHO, metadata.xml is actually the best place to describe a package with
21 tags, but I am not so sure it's the best place to "define" a tag. I guess
22 if we automate the indexing of tags, much like how use.desc.local is
23 generated from metadata.xml, then that might eliminate some of the
24 maintenance overhead.
25
26 The only way tags are going to work well is to keep the management of them
27 as automated as possible. They should only be involved in searches for
28 packages, and nothing else. E.g., a hypothetical emerge command might be:
29 emerge -T mail,client, which will show me all packages with both the tags
30 'mail' and 'client' (I didn't check emerge to see if -T already has a
31 purpose, btw).
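A rough sketch of what such a search could do, assuming the tags are already available as a mapping from package to tag set. The function name and the postfix tags are made up for illustration; the kmail and Evolution tags follow the examples further down in this mail, and nothing here is existing emerge behaviour.

```python
def packages_with_tags(tag_index, wanted):
    """Return the sorted packages that carry every tag in `wanted`."""
    wanted = set(wanted)
    return sorted(pkg for pkg, tags in tag_index.items() if wanted <= set(tags))


# Hypothetical data; these tag assignments are illustrative only.
TAG_INDEX = {
    "kde-base/kmail": {"mail", "client", "kde"},
    "mail-client/evolution": {"mail", "client", "gnome"},
    "mail-mta/postfix": {"mail", "server"},
}

# Equivalent of the hypothetical "emerge -T mail,client".
for package in packages_with_tags(TAG_INDEX, "mail,client".split(",")):
    print(package)
```

Requiring every listed tag (a logical AND) matches the reading above that the result has both 'mail' and 'client'.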
32
33 And I think we should limit the number of tags allowed per package to a
34 reasonable number. Maybe five tags maximum? StackOverflow is one example
35 where they restrict questions to five tags. In addition, SO tries to
36 suggest to you already-existing tags so that you reuse them instead of
37 creating new ones all the time. Repoman could be extended to issue a
38 warning when metadata.xml contains previously undefined tags and optionally
39 display a match of similarly-named, existing tags (if only to catch
40 misspellings, 'mial' or 'cleint' instead of 'mail' and 'client').
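To make the repoman idea concrete, here is a minimal sketch of such a check using difflib from the Python standard library. The function name, the warning wording and the known-tag list are assumptions of mine; this is not an existing repoman check.

```python
import difflib

MAX_TAGS = 5  # the limit suggested above


def check_tags(tags, known_tags):
    """Return warning strings for a package's tag list."""
    warnings = []
    if len(tags) > MAX_TAGS:
        warnings.append(f"too many tags: {len(tags)} (maximum is {MAX_TAGS})")
    candidates = sorted(known_tags)
    for tag in tags:
        if tag in known_tags:
            continue
        # Suggest existing tags that look similar, mainly to catch typos.
        similar = difflib.get_close_matches(tag, candidates, n=3, cutoff=0.6)
        hint = f" (did you mean: {', '.join(similar)}?)" if similar else ""
        warnings.append(f"undefined tag '{tag}'{hint}")
    return warnings


# Hypothetical set of already-defined tags.
KNOWN_TAGS = {"mail", "client", "kde", "gnome"}

for warning in check_tags(["mial", "cleint", "kde"], KNOWN_TAGS):
    print(warning)
```

The cutoff of 0.6 is difflib's default; a real check would probably want it tuned against the actual tag list.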
41
42
43 > Tags, on the other hand, are more 'live'. They place the package
44 > somewhere in the 'global' tag hierarchy that can change over time.
45 > I expect that people other than maintainers will be adding tags to
46 > packages (and changing them), and that people will invent new tags
47 > and apply them to more packages.
48 >
49 > So, first of all, your solution would mean that every commit adding
50 > a new tag or changing one of the tags would modify the package
51 > metadata.xml. This means a Manifest update and a ChangeLog entry (please
52 > don't get into more rules for ChangeLogs now), and this means it will be
53 > harder to find actually useful entries there.
54 >
55 > So we make tag updates harder, and increase time and size of rsync.
56
57 Instead of individual <tag> lines in metadata.xml for each tag, why not a
58 single <tags> line that contains a comma-delimited list of up to five tags,
59 whitespace optional? That should help reduce the "fluff" that adding this
60 feature brings to the tree.
61
62 E.g.,
63
64 <tags>one,two,three,four,five</tags>
65
66 vs.
67
68 <tag>one</tag>
69 <tag>two</tag>
70 <tag>three</tag>
71 <tag>four</tag>
72 <tag>five</tag>
73
74 (36 bytes vs. 82 bytes)
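Reading the single-line form is not much work either. A small sketch with xml.etree.ElementTree, assuming the proposed <tags> element sits directly under the metadata.xml root element (the element itself is only the suggestion from this mail, not something metadata.xml defines today):

```python
import xml.etree.ElementTree as ET


def parse_tags(metadata_xml):
    """Return the tags from a comma-delimited <tags> element, whitespace optional."""
    element = ET.fromstring(metadata_xml).find("tags")
    if element is None or not element.text:
        return []
    return [tag.strip() for tag in element.text.split(",") if tag.strip()]


# Hypothetical metadata.xml fragment using the proposed element.
SAMPLE = """<pkgmetadata>
  <tags>one, two,three ,four,five</tags>
</pkgmetadata>"""

print(parse_tags(SAMPLE))
```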
75
76
77 > Secondly, since tags for every package will be held in different files,
78 > people will need dedicated tools to collect tags from all those files
79 > and add matching tags to their own packages. Long story short, we're
80 > going to have many 'duplicate' tags that will require even more commits
81 > with ChangeLog entries and Manifest updates.
82
83 If we automate the generation of a master tag index file, like
84 use.desc.local, this can be avoided. emerge can simply go rummage through
85 the master index for matching tag entries instead of going through the
86 entire tree. Because if we wanted to sift through the entire tree, grep
87 would be a far better method (compiled C and probably better text-matching
88 algorithms than emerge).
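Generating the index could look roughly like the sketch below. It assumes the usual category/package/metadata.xml layout and the proposed single <tags> element; the "tag: packages" output format and the helper names are invented for illustration and are not the use.desc.local format.

```python
import tempfile
import xml.etree.ElementTree as ET
from collections import defaultdict
from pathlib import Path


def build_tag_index(tree_root):
    """Map each tag to the sorted list of category/package names that carry it."""
    index = defaultdict(set)
    # Assumes the usual <category>/<package>/metadata.xml layout.
    for metadata in sorted(Path(tree_root).glob("*/*/metadata.xml")):
        package = f"{metadata.parent.parent.name}/{metadata.parent.name}"
        element = ET.parse(str(metadata)).getroot().find("tags")
        if element is None or not element.text:
            continue  # package declares no tags
        for tag in element.text.split(","):
            tag = tag.strip()
            if tag:
                index[tag].add(package)
    return {tag: sorted(pkgs) for tag, pkgs in sorted(index.items())}


def write_sample_tree(root):
    """Create a tiny throwaway tree so the sketch can be run anywhere."""
    samples = {
        "kde-base/kmail": "mail,client,kde",
        "mail-client/evolution": "mail, client, gnome",
    }
    for package, tags in samples.items():
        directory = Path(root) / package
        directory.mkdir(parents=True)
        (directory / "metadata.xml").write_text(
            f"<pkgmetadata><tags>{tags}</tags></pkgmetadata>", encoding="utf-8"
        )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_sample_tree(tmp)
        for tag, packages in build_tag_index(tmp).items():
            print(f"{tag}: {' '.join(packages)}")
```

A tag search then only has to read the generated file rather than every metadata.xml in the tree.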
89
90
91 > Worse than that, your GLEP doesn't even have any basic rules for naming
92 > tags -- like what language form to use and, say, which character to use
93 > instead of space. This sounds like the sort of thing that's going to
94 > make it even harder to get some consistency, especially if some
95 > developers are going to follow someone else committing earlier and some
96 > will follow their own rules.
97
98 Easy: ASCII, alphanumeric only, must start with a letter, lowercase, no
99 spaces. A lot of problems are avoided if we keep tags to one-word
100 descriptors only. E.g., for mail clients, they would carry both 'mail' and
101 'client' as two of their five tags. For kmail, a third tag would be 'kde'
102 and Evolution would have 'gnome' instead.
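Those naming rules fit in a single regular expression. A minimal sketch (the function name is mine):

```python
import re

# ASCII only, lowercase, alphanumeric, must start with a letter, no spaces.
TAG_RE = re.compile(r"[a-z][a-z0-9]*")


def is_valid_tag(tag):
    """Return True if `tag` follows the naming rules described above."""
    return TAG_RE.fullmatch(tag) is not None


for candidate in ["mail", "x11", "Mail", "3d", "mail client", "mail-client"]:
    print(candidate, is_valid_tag(candidate))
```

Because the character classes are written as explicit ranges, non-ASCII letters and digits are rejected without any extra flags.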
103
104
105 > I'd honestly prefer that -- if we should really keep tags in the tree
106 > -- to do that with a single 'metadata/tags' file or some kind of
107 > hierarchy there. Keep them outside the package directory -- bind
108 > packages to tags, rather than tags to packages. Keep all the commits
109 > in a single place without altering the ebuild work flow.
110
111 While I definitely like the idea of a single, master file, I feel this could
112 run away pretty quickly as it's continuously updated. For example, adding a
113 new package, a dev has to now remember to add the new package's relevant
114 tags to this file, and remove its entry when that package is removed from
115 the tree. By auto-generating this file from metadata.xml contents, tag
116 management is more evenly distributed to individual package maintainers, who
117 just have to remember to add the relevant entries to metadata.xml for their
118 package's tags to get indexed, and when a package is removed, its tags will
119 also be removed.
120
121 I'd also suggest that 'all' be considered a default, global tag for all
122 packages, that it be a reserved tag internal to emerge and other package
123 managers, and that it not count against the number of allowed tags (meaning
124 that, technically, a package is allowed five tags + 'all').
125
126 As for default tags when a package does not define any, the package category
127 gets split at the hyphen and becomes two independent tags. This is
128 overridden when at least one tag is defined in metadata.xml.
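The default-tag rules from the last two paragraphs could be sketched like this. The function name is hypothetical, and dropping an explicitly declared 'all' is my own assumption about how the reserved tag would be handled:

```python
RESERVED_TAG = "all"  # implicit on every package, not counted against the limit


def effective_tags(category, declared_tags):
    """Return the tags a package manager would see for a package."""
    if declared_tags:
        # At least one declared tag overrides the category-derived defaults.
        base = list(declared_tags)
    else:
        # No tags declared: split the category at the hyphen.
        base = category.split("-")
    return [RESERVED_TAG] + [tag for tag in base if tag != RESERVED_TAG]


print(effective_tags("mail-client", []))
print(effective_tags("mail-client", ["mail", "client", "kde"]))
```

A category without a hyphen simply yields one default tag instead of two.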
129
130 --
131 Joshua Kinard
132 Gentoo/MIPS
133 kumba@g.o
134 4096R/D25D95E3 2011-03-28
135
136 "The past tempts us, the present confuses us, the future frightens us. And
137 our lives slip away, moment by moment, lost in that vast, terrible in-between."
138
139 --Emperor Turhan, Centauri Republic
