On Sat, Mar 22, 2014 at 6:33 PM, Alec Warner <antarus@g.o> wrote:
> https://wiki.gentoo.org/wiki/Package_Tags
>
Ack, this had to happen on a weekend when I wasn't paying attention!
And you beat me to it, too-- I was working on something in this vein,
but wasn't quite satisfied with the design yet. Oh well. You're sort
of on the right track, but some very important aspects are missing,
and the whole thing will collapse without them. (This topic has come
up in various places, but I frankly don't feel like finding the
relevant snippets, so you get a text dump. Sorry about that.)

The first thing missing is aliasing (most proposals for this sort of
system miss this at first; don't feel too bad). There are many, many,
many cases where you want more than one tag query to resolve to the
same canonical tag. The ability to define aliases that take care of
this automatically is critical. In my notes on this, I had a global
alias file, plus an optional per-user /etc/portage/tag.alias. It's
just text -- nothing special -- that defines antecedent = consequent
relationships. This means the antecedent is _replaced_ by the
consequent. A quick example is "cpp = c++". This also allows for
simple changes to the canonical name.
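
To make the replacement semantics concrete, here is a minimal Python
sketch. The names resolve_alias, global_aliases and user_aliases, and
the cxx entry, are assumptions for illustration only; nothing here is
existing Portage code. It also assumes that user aliases override
global ones and that chains of aliases are followed to the end:

```python
def resolve_alias(tag, aliases):
    """Follow alias links until a tag with no further alias is reached."""
    seen = set()
    while tag in aliases:
        if tag in seen:
            # Guard against definitions such as a = b, b = a.
            raise ValueError(f"alias cycle involving {tag!r}")
        seen.add(tag)
        tag = aliases[tag]
    return tag


# "cpp = c++" comes from the text; "cxx = cpp" is a hypothetical user entry.
global_aliases = {"cpp": "c++"}
user_aliases = {"cxx": "cpp"}

# Assumption: entries from the user file win over the global file.
aliases = {**global_aliases, **user_aliases}

print(resolve_alias("cxx", aliases))  # c++
print(resolve_alias("c++", aliases))  # c++ (already canonical)
```

Renaming a canonical tag then only needs one new alias line from the
old name to the new one.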

Second, implication is important for decreasing maintenance burden.
An implication is an antecedent -> consequent relationship where the
consequent is automatically added if the antecedent is present.
Unlike aliasing, the consequent doesn't _replace_ the antecedent. An
example of this is acpi -> power_management, because acpi is a
distinct aspect of power management, and has value on its own. Over
time, this significantly lowers the maintenance burden of an expanding
vocabulary and tree.
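
As a rough sketch of how implication differs from aliasing, the
Python below adds implied tags while keeping the originals. The
function name expand_implications and the second rule
(power_management -> hardware) are hypothetical, added only to show
that implications can chain:

```python
def expand_implications(tags, implications):
    """Return the tags plus everything they imply, directly or not."""
    result = set(tags)
    pending = list(tags)
    while pending:
        tag = pending.pop()
        for implied in implications.get(tag, ()):
            if implied not in result:
                result.add(implied)
                pending.append(implied)
    return result


implications = {
    "acpi": {"power_management"},      # example from the text
    "power_management": {"hardware"},  # hypothetical, shows chaining
}

print(sorted(expand_implications({"acpi"}, implications)))
# ['acpi', 'hardware', 'power_management']
```

The antecedent stays in the result, so a query for acpi still matches
on its own.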

With that in place, I want to make something clear: consistency in the
vocabulary is absolutely critical. I cannot overemphasise how
important this is. Adding tags without any sort of discipline leads
to an unmaintainable vocabulary, which makes the whole thing as
worthless as some people think it is. So there needs to be some sort
of basic canonical list of tags with their descriptions, and yes,
people should be expected to be rigorous in how they approach this.
I've attached a rough draft of descriptions and aliases that I pulled
together a while ago (analogous to /usr/portage/profiles/use.desc).

This is where aliasing becomes essential, because it allows us to
guarantee some amount of consistency. We're only human and can't be
expected to cover every situation, but there's plenty of low-hanging
fruit in this area. e.g.:
app = application                       # Alias abbreviation to full tag
editors = editor                        # Make plural -> singular aliases
                                        # standard where sensible.
                                        # Rule of thumb 1: "This is a(n)..."
admin = administration                  # Rule of thumb 2: "This is a(n)...
                                        # ...tool"
backup = back-up                        # Can use hyphenated forms
benchmark = benchmarking                # As with admin, only gerund form.
cdr = disk_authoring                    # Spaces replaced with underscores at
                                        # word boundaries
i18n = internationalisation             # Will need to come to a consensus on
                                        # the s/z spelling and make some
                                        # aliases.
cpp = c++                               # Valid tags should be restricted to
                                        # basic ASCII minus spaces (replaced
                                        # with underscores) for our own sanity
.net = dotnet                           # This could go either way, but the
                                        # leading period makes my Unix blood
                                        # distrust it.
gamedev = game_development              # "games" becomes ambiguous with
                                        # "game", so prefer a clearer form.
lang = language = programming_language  # Not to be confused with the i18n
                                        # language support. Avoid confusion
                                        # with clear naming.
version_control = source_control = vcs  # Well-known abbreviations can be used
                                        # in place of their expansions
mail = email                            # No sense not being clear
mail_server = mail_transfer_agent = mta # Multiple aliases to the same thing
                                        # are acceptable
nntp = {{newsreader usenet}}            # The braced notation denotes an
                                        # intersection of two tags. Need to
                                        # decide if this sort of alias is
                                        # legal. I'm thinking no, honestly.
sys = system                            # BUT it's in conflict with @system!
                                        # Don't do that.
www = web                               # These are all things that deal with
                                        # the web specifically.
apache = apache_module                  # Classes of packages that have their
                                        # own categories are exactly why this
                                        # is a good idea.
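
For what it's worth, a file in this shape is easy to read. The Python
below is only a sketch under several assumptions that the excerpt
does not settle: "#" starts a comment, every name in a chain maps to
the rightmost one, and names containing spaces or non-ASCII characters
are rejected (which also rejects the braced intersection form).
parse_aliases and is_valid_tag are hypothetical names:

```python
def is_valid_tag(tag):
    """Assumed rule: non-empty, basic ASCII, no whitespace."""
    return bool(tag) and tag.isascii() and not any(c.isspace() for c in tag)


def parse_aliases(text):
    """Map every antecedent on a line to the rightmost (canonical) name."""
    aliases = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop trailing comment
        if not line:
            continue  # blank or comment-only line
        terms = [term.strip() for term in line.split("=")]
        if len(terms) < 2 or not all(is_valid_tag(t) for t in terms):
            raise ValueError(f"line {lineno}: bad alias definition {raw!r}")
        *antecedents, canonical = terms
        for antecedent in antecedents:
            aliases[antecedent] = canonical
    return aliases


sample = """\
cpp = c++                                # simple alias
mail_server = mail_transfer_agent = mta  # chained alias
                                         # continuation comment
"""

aliases = parse_aliases(sample)
print(aliases)
# {'cpp': 'c++', 'mail_server': 'mta', 'mail_transfer_agent': 'mta'}
```

Whether a conflicting name such as sys should be refused by the parser
or by review is a policy question this sketch leaves open.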

The above is just an excerpt copied directly from my notes on
aliasing. Some other stuff:
- Query syntax and semantics can be addressed in greater detail later.
There's some nice sugar to be had here.
- Likewise, tools. Something along the lines of quse and equery would
be handy in support of this.
- Aliases for reasonable search terms are not a bad idea.
- I've stated this at various points in the past, but categories are
already tags after a fashion. They're not very good ones, but they're
a good place to start. Moreover, current metapackages and sets are
somewhat like tags in their own right.
- USE flags might also be considered as a source of inspiration. That
said, I don't think anything like conditional tags based on the
profile's selected USE is a good idea. Don't make this more complex
than it is.
- Succinctly, strongly hierarchical tags are a mistake and will cause
you more grief than you can imagine. Ontologically, aim for "mostly
flat".
- Limiting the number of tags allowed on a package is a horrible idea;
seriously, don't even consider that-- you would absolutely regret it.
The whole point of this is to allow useful semantic description.
- Crowdsourcing is something that _can_ work, but needs to be
moderated in some way. It could work well to deputise some trusted
users for this task, similar to arch testing, with a mandate to do
responsible tag gardening.
- A good maxim for additions is "tag what you see". If it provides a
library with Lua bindings, then that's probably a good thing to tag.
- Maintainers can be awfully possessive of their packages, but on this
subject I think it would benefit them to unclench a little. Most
additions should be relatively obvious.
- Per-$PV tagging is honestly probably not necessary. Sticking it in
metadata.xml seems reasonable for now.
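
To illustrate the last point, per-package tags in metadata.xml could
be read with a few lines of Python. This is purely a sketch: the <tag>
element is an assumption and not part of the current metadata.xml
format, and read_tags is a hypothetical helper:

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata.xml content; <tag> is an assumed element.
METADATA = """\
<pkgmetadata>
  <longdescription>An example package.</longdescription>
  <tag>editor</tag>
  <tag>lua</tag>
</pkgmetadata>
"""


def read_tags(xml_text):
    """Return the sorted, de-duplicated tags found in the XML text."""
    root = ET.fromstring(xml_text)
    return sorted(
        {el.text.strip() for el in root.iter("tag") if el.text and el.text.strip()}
    )


print(read_tags(METADATA))  # ['editor', 'lua']
```

Nothing here limits how many tags a package carries, and the tags
apply to the package as a whole rather than to a single version.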

Regards,
Wyatt