On Sun, 14 Feb 2016 12:37:33 +0100
Patrick Lauer <patrick@g.o> wrote:

> On 02/11/2016 09:15 PM, Patrick Lauer wrote:
> > Now instead of looking up [metadata.xml] -> (herd name) -> [herds.xml]
> > -> email it goes backwards:
> > [metadata.xml] -> (maintainer type=project) -> email -> [projects.xml]
> > -> Project name
> >
> > Since this involves XML and python's ElementTree library it's a
> > nontrivial change that also removes a few now useless helpers
> > (_get_herd_email has no reason to be, but we'd need a _get_herd_name
> > helper instead. Err, get_proj ... ah well, whatever name works)
> >
> > And all that just so (1) gentoolkit output works and (2) euscan updates
> > properly. Both of which I don't really care about much, but now that
> > I've invested ~4h into debugging and trying to fix it I'm a tiny bit
> > IRRITATED.
> >
> So this turns out to be more fun than expected.
>
> Having spent a little bit of time staring at XML, DTDs and wondering why
> we do things the most difficult way ...
>
> Previously the herd tag was defined as:
> <!ELEMENT herd (#PCDATA)>
>
> So we end up with, for example:
> <herd>kde</herd>
>
> The new schema collapses herd (err, project!) into maintainers (err,
> sustainers ... staff ... linchpin?)
> And maintainer is defined as:
> <!ELEMENT maintainer ( email, (description| name)* )>
>
> Which means that only email is mandatory. So instead of search by name
> you are now required to search by email.

Congratulations! After the whole discussion, GLEP, explanatory blog
post and explanatory mails, you finally figured out how things work!

> And it leads to inconsistent (partial) duplication: Some metadata.xml
> entries carry Name, some Description, and some are Email only.
>
> For example for gentoolkit this means that instead of search by name now
> it needs to be search by email, and the previous search by name
> functionality requires herds.xml, err, projects.xml to figure out the
> name of a project. Which might not match the one in metadata.xml!
> (And you may need to filter out maintainers-that-are-not-projects, and
> what about maintainers that are undefined? So much extra code complexity!)

Everything becomes complex if you try hard to make it sound complex.

Maintainer is a single entity, and it's identified by e-mail. So if you
want to group packages, you just group by e-mail. Simple as that. No
special magic needed. No extra files needed. The basic functionality
works without any special needs. You can also use it directly to
contact the maintainer or assign bugs.
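
A minimal sketch of that grouping, using only Python's ElementTree. The
package names, e-mail addresses and the in-memory SAMPLE_METADATA dict are
made up for illustration; a real tool would read the metadata.xml files
from the repository instead.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data: package name -> contents of its metadata.xml.
SAMPLE_METADATA = {
    "kde-apps/konsole": """<pkgmetadata>
  <maintainer type="project">
    <email>kde@gentoo.org</email>
    <name>Gentoo KDE Project</name>
  </maintainer>
</pkgmetadata>""",
    "kde-apps/dolphin": """<pkgmetadata>
  <maintainer type="project">
    <email>kde@gentoo.org</email>
  </maintainer>
</pkgmetadata>""",
    "app-misc/foo": """<pkgmetadata>
  <maintainer type="person">
    <email>dev@example.org</email>
    <name>Some Developer</name>
  </maintainer>
</pkgmetadata>""",
}


def group_by_email(metadata_by_package):
    """Map each maintainer e-mail to the sorted list of its packages."""
    groups = defaultdict(list)
    for package, xml_text in metadata_by_package.items():
        root = ET.fromstring(xml_text)
        for maintainer in root.iter("maintainer"):
            # <email> is the only mandatory child, so it is the grouping key.
            email = maintainer.findtext("email", default="").strip()
            if email:
                groups[email].append(package)
    return {email: sorted(pkgs) for email, pkgs in groups.items()}


groups = group_by_email(SAMPLE_METADATA)
for email, packages in sorted(groups.items()):
    print(email, packages)
```

Note that the optional name is never consulted here: the two KDE entries
land in the same group even though only one of them carries a name.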

If you want to split maintainers into people and projects, you've got
the type="" attribute, which is also in place, with no extra files needed.

If you want pretty project names, then you can use projects.xml.
Or the name in metadata.xml. Both are fine by definition.
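
As a rough sketch of both points: the snippet below splits maintainers on
the type attribute and picks a display name. The projects.xml layout
(<project> elements with <email> and <name> children), the sample
addresses, and the order of preference (projects.xml first, then the name
in metadata.xml, then the bare e-mail) are all assumptions made for the
example, not something the format mandates.

```python
import xml.etree.ElementTree as ET

# Hypothetical projects.xml excerpt; the element layout is assumed.
PROJECTS_XML = """<projects>
  <project>
    <email>kde@gentoo.org</email>
    <name>KDE</name>
  </project>
</projects>"""

# Hypothetical metadata.xml with mixed maintainer types.
METADATA_XML = """<pkgmetadata>
  <maintainer type="project">
    <email>kde@gentoo.org</email>
  </maintainer>
  <maintainer type="person">
    <email>dev@example.org</email>
    <name>Some Developer</name>
  </maintainer>
  <maintainer type="project">
    <email>unlisted@example.org</email>
  </maintainer>
</pkgmetadata>"""


def load_project_names(projects_xml):
    """Build an e-mail -> project name lookup table."""
    names = {}
    for project in ET.fromstring(projects_xml).iter("project"):
        email = project.findtext("email", default="").strip()
        name = project.findtext("name", default="").strip()
        if email and name:
            names[email] = name
    return names


def describe_maintainers(metadata_xml, project_names):
    """Return (type, email, display name) for every maintainer."""
    rows = []
    for m in ET.fromstring(metadata_xml).iter("maintainer"):
        email = m.findtext("email", default="").strip()
        kind = m.get("type", "unknown")
        local_name = m.findtext("name", default="").strip()
        if kind == "project":
            # Assumed preference: projects.xml, then local name, then e-mail.
            name = project_names.get(email) or local_name or email
        else:
            name = local_name or email
        rows.append((kind, email, name))
    return rows


rows = describe_maintainers(METADATA_XML, load_project_names(PROJECTS_XML))
for row in rows:
    print(row)
```

A project missing from projects.xml and carrying no name simply falls
back to its e-mail, so nothing breaks when the optional data is absent.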

Now, the surprise: current people-maintainers could already have
different (or no) names in different metadata.xml files! You didn't
expect that, did you? In other words, that's neither a new issue nor
a major problem.

> And this is why I avoided the topic and hoped that the 'migration' would
> make sense:
> (1) Using XML is mildly insane. Neither machine- nor human-readable

Wrong. It works well for machines.

> (2) The DTD is even more insane, and few people have the patience to
> figure it out

And what do you need the DTD for? Furthermore, it's in the process of being
replaced.

> (3) The recent changes to the DTD change the data model in subtle ways
> so that there's even *more* denormalization possible
> (4) The tooling is, due to XML, wonderfully horrible and requires things
> like XPATH to get the required data (because query by attribute is
> harder than query by tag)

Of course it does. Because no modern programming languages provide such
complex features as conditionals!
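
For illustration, a short sketch of querying by attribute with a plain
conditional, next to the same query written with the limited XPath subset
that ElementTree's findall() accepts. The sample document and addresses
are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata.xml; the last maintainer has no type attribute.
METADATA_XML = """<pkgmetadata>
  <maintainer type="project"><email>kde@gentoo.org</email></maintainer>
  <maintainer type="person"><email>dev@example.org</email></maintainer>
  <maintainer><email>legacy@example.org</email></maintainer>
</pkgmetadata>"""

root = ET.fromstring(METADATA_XML)

# Plain conditional on the attribute, no XPath involved.
project_emails = [
    m.findtext("email")
    for m in root.iter("maintainer")
    if m.get("type") == "project"
]

# The same query using ElementTree's attribute predicate.
xpath_emails = [
    m.findtext("email")
    for m in root.findall("maintainer[@type='project']")
]

print(project_emails)
print(xpath_emails)
```

Either form ignores the maintainer without a type attribute, since
get("type") returns None for it and the predicate does not match.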

> There's fundamental questions that should be handled before doing more
> modifications - for example, should the data be more normalized (e.g.
> name only in projects.xml / maintainers.xml and only email in
> metadata.xml)? If we allow denormalization, do we have tools to check
> and autocorrect (e.g. a maintainer changing name)?
>
> Once we decide to abstract it away so that people should use tools and
> not mangle it manually (have you looked at herds.xml ?! omg ...) there's
> the question ... why XML? It's about the worst format for this job, INI
> format is sufficient and easier to parse. Or JSON, or YAML, or whatever
> is trendy now. Or do we autogenerate from templates?

What is the gain? Who is going to fix all the tools?

> Another funny thing: projects.xml is not in the same repository, so
> synchronizing changes gets more tricky. And the metadata.dtd is in yet
> another place. Wouldn't it make sense to have this organized in a less
> confusing way?

projects.xml is autogenerated from the wiki. Yes, the place you refuse to
visit. Which means you'll never exist in projects.xml.

DTDs are not needed for anything, except for doing poor man's
correctness verification.

> You see where this is going - and why I didn't object loud enough to the
> changes: I want to not care about this whole cluster of topics and do
> things that are more rewarding. But that choice got taken away when
> things broke (oh, they didn't break, they Function Differently now) and
> I had to spend some time investigating why things deviate.

Of course you had to. Because reading is hard.

--
Best regards,
Michał Górny
<http://dev.gentoo.org/~mgorny/>