Gentoo Archives: gentoo-dev

From: Wyatt Epp <wyatt.epp@×××××.com>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] RFC GLEP 1005: Package Tags
Date: Mon, 24 Mar 2014 01:26:16
Message-Id: CAPCkgLm4J0pu=3fu_TbnW8UUk-BcH6a94uxJ51wPh_YR=haxaA@mail.gmail.com
In Reply to: [gentoo-dev] RFC GLEP 1005: Package Tags by Alec Warner
1 On Sat, Mar 22, 2014 at 6:33 PM, Alec Warner <antarus@g.o> wrote:
2 > https://wiki.gentoo.org/wiki/Package_Tags
3 >
4 Ack, this had to happen on a weekend when I wasn't paying attention!
5 And you beat me to it, too-- I was working on something in this vein,
6 but wasn't quite satisfied with the design yet. Oh well. You're sort
7 of on the right track, but some very important aspects are missing,
8 and the whole thing will collapse without them.
9 (This thread has been in various places, but I frankly don't feel like
10 finding the relevant snippets, so you get a text dump. Sorry about
11 that.)
12
13 The first thing missing is aliasing (most proposals for this sort of
14 system miss this at first; don't feel too bad). There are many, many,
15 many cases where you want more than one single tag query to resolve to
16 the same canonical tag. The ability to define aliases that take care
17 of this automatically is critical. In my notes on this, I had a
18 global alias file, and users can have an /etc/portage/tag.alias. It's
19 just text -- nothing special -- that defines antecedent = consequent
20 relationships. This means the antecedent is _replaced_ by the
21 consequent. As a quick example: "cpp = c++". This also allows for simple
22 changes to the canonical name.
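
A minimal sketch of the replacement semantics described above, assuming the alias table has already been loaded into a dictionary. The names ALIASES and canonicalise are hypothetical, not part of Portage or the GLEP; the cycle check is an added safety assumption.

```python
# Hypothetical alias table: antecedent -> consequent (the canonical tag).
ALIASES = {
    "cpp": "c++",
    "editors": "editor",
    "app": "application",
}


def canonicalise(tag, aliases=ALIASES):
    """Replace a tag by its consequent until no alias applies."""
    seen = set()
    while tag in aliases:
        if tag in seen:
            # Assumption: a cycle such as a = b, b = a is a vocabulary error.
            raise ValueError(f"alias cycle involving {tag!r}")
        seen.add(tag)
        tag = aliases[tag]
    return tag


print(canonicalise("cpp"))     # an alias: replaced by its canonical tag
print(canonicalise("editor"))  # already canonical: returned unchanged
```

Following chains of aliases is what makes renaming a canonical tag cheap: the old name simply becomes one more alias for the new one.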
23
24 Second, implication is important for decreasing maintenance burden.
25 An implication is an antecedent -> consequent relationship where the
26 consequent is automatically added if the antecedent is present.
27 Unlike aliasing, the consequent doesn't _replace_ the antecedent. An
28 example of this is acpi -> power_management, because acpi is a
29 distinct aspect of power management, and has value on its own. Over
30 time, this significantly lowers the maintenance burden of an expanding
31 vocabulary and tree.
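
A rough sketch of implication, assuming implications are stored as a mapping from antecedent to a set of consequents and that they apply transitively (the transitivity is an assumption, not something stated above). IMPLICATIONS and expand are hypothetical names.

```python
# Hypothetical implication table: antecedent -> set of consequents.
IMPLICATIONS = {
    "acpi": {"power_management"},
}


def expand(tags, implications=IMPLICATIONS):
    """Return the tags plus every tag they imply; nothing is replaced."""
    result = set(tags)
    pending = list(result)
    while pending:
        tag = pending.pop()
        for implied in implications.get(tag, ()):
            if implied not in result:
                result.add(implied)
                pending.append(implied)
    return result


print(sorted(expand({"acpi"})))
```

Unlike the alias case, the antecedent stays in the result, so a package tagged only with acpi still matches a query for power_management.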
32
33 With that in place, I want to make something clear: consistency in the
34 vocabulary is absolutely critical. I cannot overemphasise how
35 important this is. Adding tags without any sort of discipline leads
36 to an unmaintainable vocabulary, which makes the whole thing as
37 worthless as some people think. So there needs to be some sort of basic
38 canonical list of tags with their descriptions, and yes, people should
39 be expected to be rigorous in how they approach this. I've attached
40 a rough draft of descriptions and aliases that I pulled together a
41 while ago (analogous to /usr/portage/profiles/use.desc).
42
43 This is where aliasing becomes essential, because it allows us to
44 guarantee some amount of consistency. We're only human and can't be
45 expected to cover every situation, but there's plenty of low-hanging
46 fruit in this area, e.g.:
47 app = application                       # Alias abbreviation to full tag
48 editors = editor                        # Make plural -> singular
49                                         # aliases standard where sensible.
50                                         # Rule of thumb 1: "This is a(n)..."
51 admin = administration                  # Rule of thumb 2: "This is
52                                         # a(n)... ...tool"
53 backup = back-up                        # Can use hyphenated forms
54 benchmark = benchmarking                # As with admin, only gerund form.
55 cdr = disk_authoring                    # Spaces replaced with
56                                         # underscores at word boundaries
57 i18n = internationalisation             # Will need to come to a
58                                         # consensus on the s/z spelling and make some aliases.
59 cpp = c++                               # Valid tags should be
60                                         # restricted to basic ASCII minus spaces (replaced with underscores) for
61                                         # our own sanity
62 .net = dotnet                           # This could go either way,
63                                         # but the leading period makes my Unix blood distrust it.
64 gamedev = game_development              # "games" becomes ambiguous
65                                         # with "game" so prefer a more-clear form.
66 lang = language = programming_language  # Not to be confused with the
67                                         # i18n language support. Avoid confusion with clear naming
68 version_control = source_control = vcs  # Well known abbreviations can
69                                         # be used in place of their expansions
70 mail = email                            # No sense not being clear
71 mail_server = mail_transfer_agent = mta # Multiple aliases to the same
72                                         # thing are acceptable
73 nntp = {{newsreader usenet}}            # The braced notation denotes
74                                         # an intersection of two tags. Need to decide if this sort of alias is
75                                         # legal. I'm thinking no, honestly.
76 sys = system                            # BUT it's in conflict with
77                                         # @system! Don't do that.
78 www = web                               # These are all things that
79                                         # deal with the web specifically.
80 apache = apache_module                  # classes of packages that
81                                         # have their own categories is exactly why this is a good idea.
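
One possible way to read entries in this notation, offered only as a sketch. It assumes that "#" starts a comment, that in a chain such as "a = b = c" the rightmost name is the canonical tag, and that a legal tag is printable ASCII with no whitespace (so the braced intersection form is rejected). The function name parse_alias_lines and the regular expression are hypothetical.

```python
import re

# Assumption: a legal tag is one or more printable ASCII characters, no spaces.
TAG_RE = re.compile(r"[!-~]+")


def parse_alias_lines(lines):
    """Turn 'alias = ... = canonical  # comment' lines into a dict."""
    aliases = {}
    for raw in lines:
        # Drop the comment (assumes tags never contain '#').
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # blank or comment-only line
        names = [name.strip() for name in line.split("=")]
        if len(names) < 2:
            raise ValueError(f"not an alias definition: {raw!r}")
        for name in names:
            if not TAG_RE.fullmatch(name):
                raise ValueError(f"illegal tag {name!r} in {raw!r}")
        canonical = names[-1]  # assumption: rightmost name is canonical
        for alias in names[:-1]:
            aliases[alias] = canonical
    return aliases


sample = [
    "cpp = c++                               # basic ASCII, no spaces",
    "mail_server = mail_transfer_agent = mta # several aliases, one target",
    "                                        # comment-only line",
]
print(parse_alias_lines(sample))
```

Rejecting malformed tags at parse time is one way to enforce the "basic ASCII minus spaces" rule mechanically instead of relying on review.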
82
83 The alias list above is just an excerpt copied directly from my notes on
84 aliasing. Some other stuff:
85 - Query syntax and semantics can be addressed in greater detail later.
86 There's some nice sugar to be had here.
87 - Likewise, tools. Something along the lines of quse and equery would
88 be handy in support of this.
89 - Aliases for reasonable search terms are not a bad idea.
90 - I've stated this at various points in the past, but categories are
91 already tags after a fashion. They're not very good ones, but they're
92 a good place to start. Moreover, current metapackages and sets are
93 somewhat like tags in their own right.
94 - USEs might also be considered as a source of inspiration. That said,
95 I don't think anything like conditional tags based on the profile's
96 selected USE is a good idea. Don't make this more complex than it is.
97 - Succinctly, strongly hierarchical tags are a mistake and will cause
98 you more grief than you can imagine. Ontologically, aim for "mostly
99 flat".
100 - Limiting the number of tags allowed on a package is a horrible idea;
101 seriously, don't even consider that-- you would absolutely regret it.
102 The whole point of this is to allow useful semantic description.
103 - Crowdsourcing is something that _can_ work, but needs to be
104 moderated in some way. It could work well to deputise some trusted
105 users for this task, similar to arch testing, and give them a mandate to
106 do responsible tag gardening.
107 - A good maxim for additions is "tag what you see". If it provides a
108 library with Lua bindings, then that's probably a good thing to tag.
109 - Maintainers can be awfully possessive of their packages, but on this
110 subject I think it would benefit them to unclench a little. Most
111 additions should be relatively obvious.
112 - Per-$PV tagging is honestly probably not necessary. Sticking it in
113 metadata.xml seems reasonable for now.
114
115 Regards,
116 Wyatt

Attachments

File name MIME type
alias.desc application/octet-stream
tags.desc application/octet-stream