Gentoo Archives: gentoo-dev

From: Patrick Lauer <patrick@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] Uncoordinated changes
Date: Sun, 14 Feb 2016 11:38:36
Message-Id: 56C066FD.9010106@gentoo.org
In Reply to: [gentoo-dev] Uncoordinated changes by Patrick Lauer
1 On 02/11/2016 09:15 PM, Patrick Lauer wrote:
2 > Now instead of looking up [metadata.xml] -> (herd name) -> [herds.xml]
3 > -> email it goes backwards:
4 > [metadata.xml] -> (maintainer type=project) -> email -> [projects.xml]
5 > -> Project name
6 >
7 > Since this involves XML and python's ElementTree library it's a
8 > nontrivial change that also removes a few now useless helpers
9 > (_get_herd_email has no reason to be, but we'd need a _get_herd_name
10 > helper instead. Err, get_proj ... ah well, whatever name works)
11 >
12 > And all that just so (1) gentoolkit output works and (2) euscan updates
13 > properly. Both of which I don't really care about much, but now that
14 > I've invested ~4h into debugging and trying to fix it I'm a tiny bit
15 > IRRITATED.
16 >
17 So this turns out to be more fun than expected.
18
19 Having spent a little bit of time staring at XML, DTDs and wondering why
20 we do things the most difficult way ...
21
22 Previously the herd tag was defined as:
23 <!ELEMENT herd (#PCDATA)>
24
25 So we end up with, for example:
26 <herd>kde</herd>
27
28 The new schema collapses herd (err, project!) into maintainers (err,
29 sustainers ... staff ... linchpin?)
30 And maintainer is defined as:
31 <!ELEMENT maintainer ( email, (description| name)* )>
32
33 Which means that only email is mandatory. So instead of searching by
34 name you are now required to search by email.
35 And it leads to inconsistent (partial) duplication: Some metadata.xml
36 entries carry Name, some Description, and some are Email only.
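To make the three variants concrete, here is a minimal sketch using Python's xml.etree.ElementTree. The sample metadata, names and addresses are invented for illustration, and read_maintainers is a hypothetical helper, not gentoolkit code. The point is only that a consumer can rely on email and nothing else:

```python
import xml.etree.ElementTree as ET

# Invented metadata.xml fragment: one entry with a name, one with a
# description, one with email only. All three are valid under the DTD.
SAMPLE = """
<pkgmetadata>
  <maintainer type="project">
    <email>kde@example.org</email>
    <name>Example KDE project</name>
  </maintainer>
  <maintainer>
    <email>dev@example.org</email>
    <description>Primary maintainer</description>
  </maintainer>
  <maintainer>
    <email>other@example.org</email>
  </maintainer>
</pkgmetadata>
"""

def read_maintainers(xml_text):
    """Return one dict per <maintainer>; every field except email may be None."""
    root = ET.fromstring(xml_text)
    result = []
    for m in root.findall("maintainer"):
        result.append({
            "email": m.findtext("email"),              # the only mandatory child
            "name": m.findtext("name"),                # None when absent
            "description": m.findtext("description"),  # None when absent
            "type": m.get("type"),                     # None when the attribute is absent
        })
    return result

maintainers = read_maintainers(SAMPLE)
for entry in maintainers:
    print(entry)
```

Any code that used to key on a name has to cope with None in that field.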
37
38 For gentoolkit, for example, this means that the search must now be by
39 email rather than by name, and the previous search-by-name
40 functionality requires herds.xml, err, projects.xml to figure out the
41 name of a project. Which might not match the one in metadata.xml!
42 (And you may need to filter out maintainers-that-are-not-projects, and
43 what about maintainers that are undefined? So much extra code complexity!)
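Roughly, the new lookup looks like the sketch below. It assumes projects.xml is a flat list of <project> elements, each with an <email> and a <name> child; the sample data and both function names are made up here and are not the actual gentoolkit helpers. Untyped and non-project maintainers are simply skipped, and a project email that is not in projects.xml yields None:

```python
import xml.etree.ElementTree as ET

# Invented sample data. The projects.xml layout is an assumption.
METADATA = """
<pkgmetadata>
  <maintainer type="project">
    <email>kde@example.org</email>
    <name>Some stale name</name>
  </maintainer>
  <maintainer type="person">
    <email>dev@example.org</email>
  </maintainer>
  <maintainer>
    <email>untyped@example.org</email>
  </maintainer>
</pkgmetadata>
"""

PROJECTS = """
<projects>
  <project>
    <email>kde@example.org</email>
    <name>Example KDE</name>
  </project>
</projects>
"""

def project_names_by_email(projects_xml):
    """Build an email -> project name map from projects.xml content."""
    root = ET.fromstring(projects_xml)
    return {p.findtext("email"): p.findtext("name")
            for p in root.findall("project")}

def project_names_for_package(metadata_xml, projects_xml):
    """Resolve a package's project maintainers to names via their email."""
    names = project_names_by_email(projects_xml)
    root = ET.fromstring(metadata_xml)
    found = []
    # Attribute predicate: only maintainers explicitly typed as projects.
    for m in root.findall("maintainer[@type='project']"):
        found.append(names.get(m.findtext("email")))  # None if unknown
    return found

print(project_names_for_package(METADATA, PROJECTS))
```

Note that the name stored in metadata.xml ("Some stale name") is ignored entirely; whether that is the right choice is exactly the open question.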
44
45 And this is why I avoided the topic and hoped that the 'migration' would
46 make sense:
47 (1) Using XML is mildly insane. Neither machine- nor human-readable
48 (2) The DTD is even more insane, and few people have the patience to
49 figure it out
50 (3) The recent changes to the DTD change the data model in subtle ways
51 so that there's even *more* denormalization possible
52 (4) The tooling is, due to XML, wonderfully horrible and requires things
53 like XPATH to get the required data (because query by attribute is
54 harder than query by tag)
55
56 There are fundamental questions that should be settled before doing more
57 modifications - for example, should the data be more normalized (e.g.
58 name only in projects.xml / maintainers.xml and only email in
59 metadata.xml)? If we allow denormalization, do we have tools to check
60 and autocorrect (e.g. a maintainer changing name)?
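A checker for the denormalized case does not have to be big. The sketch below is a hypothetical helper, not an existing tool: it takes the text of one metadata.xml plus an email -> name mapping (assumed to come from wherever the canonical names live) and reports every maintainer whose local name disagrees. Entries without a name, or with an unknown email, are not reported:

```python
import xml.etree.ElementTree as ET

def find_name_mismatches(metadata_xml, canonical_names):
    """Return (email, name_in_metadata, canonical_name) for each disagreement."""
    root = ET.fromstring(metadata_xml)
    mismatches = []
    for m in root.iter("maintainer"):
        email = m.findtext("email")
        local = m.findtext("name")
        canonical = canonical_names.get(email)
        # Only flag entries where both names exist and differ.
        if local is not None and canonical is not None and local != canonical:
            mismatches.append((email, local, canonical))
    return mismatches

# Invented example data.
metadata = """
<pkgmetadata>
  <maintainer type="project">
    <email>kde@example.org</email>
    <name>Old KDE name</name>
  </maintainer>
  <maintainer>
    <email>dev@example.org</email>
  </maintainer>
</pkgmetadata>
"""
canonical = {"kde@example.org": "Example KDE", "dev@example.org": "A. Developer"}

report = find_name_mismatches(metadata, canonical)
print(report)
```

Autocorrecting would mean rewriting the file, which is a separate problem (formatting, comments, commit access) and is left out here.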
61
62 Once we decide to abstract it away so that people should use tools and
63 not mangle it manually (have you looked at herds.xml ?! omg ...) there's
64 the question ... why XML? It's about the worst format for this job, INI
65 format is sufficient and easier to parse. Or JSON, or YAML, or whatever
66 is trendy now. Or do we autogenerate from templates?
67
68 Another funny thing: projects.xml is not in the same repository, so
69 synchronizing changes gets more tricky. And the metadata.dtd is in yet
70 another place. Wouldn't it make sense to have this organized in a less
71 confusing way?
72
73 You see where this is going - and why I didn't object loudly enough to the
74 changes: I want to not care about this whole cluster of topics and do
75 things that are more rewarding. But that choice got taken away when
76 things broke (oh, they didn't break, they Function Differently now) and
77 I had to spend some time investigating why things deviate.
78
79 Sigh.
80
81
82 Am I grumpy?