On 02/11/2016 09:15 PM, Patrick Lauer wrote:
> Now instead of looking up [metadata.xml] -> (herd name) -> [herds.xml]
> -> email it goes backwards:
> [metadata.xml] -> (maintainer type=project) -> email -> [projects.xml]
> -> Project name
>
> Since this involves XML and python's ElementTree library it's a
> nontrivial change that also removes a few now useless helpers
> (_get_herd_email has no reason to be, but we'd need a _get_herd_name
> helper instead. Err, get_proj ... ah well, whatever name works)
>
> And all that just so (1) gentoolkit output works and (2) euscan updates
> properly. Both of which I don't really care about much, but now that
> I've invested ~4h into debugging and trying to fix it I'm a tiny bit
> IRRITATED.
>
So this turns out to be more fun than expected.
|
Having spent a little bit of time staring at XML, DTDs and wondering why
we do things the most difficult way ...
|
Previously the herd tag was defined as:
<!ELEMENT herd (#PCDATA)>
|
So we end up with, for example:
<herd>kde</herd>
|
The new schema collapses herd (err, project!) into maintainers (err,
sustainers ... staff ... linchpin?)
And maintainer is defined as:
<!ELEMENT maintainer ( email, (description| name)* )>

Which means that only email is mandatory. So instead of searching by
name you are now required to search by email.
And it leads to inconsistent (partial) duplication: some metadata.xml
entries carry a name, some a description, and some are email only.
|
For gentoolkit, for example, this means that what used to be a search by
name now has to be a search by email, and the previous search-by-name
functionality requires herds.xml, err, projects.xml to figure out the
name of a project. Which might not match the one in metadata.xml!
(And you may need to filter out maintainers-that-are-not-projects, and
what about maintainers that are undefined? So much extra code complexity!)
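
To make the extra hops concrete, here is a minimal sketch of the new
lookup chain using ElementTree. The sample data is made up, and the
projects.xml layout (a <project> element with <email> and <name>
children) is an assumption for illustration; the helper names are
hypothetical and not what gentoolkit actually uses.

```python
import xml.etree.ElementTree as ET

# Made-up sample data; the projects.xml layout is an assumption.
METADATA_XML = """
<pkgmetadata>
  <maintainer type="project">
    <email>kde@gentoo.org</email>
    <name>Gentoo KDE Project</name>
  </maintainer>
  <maintainer type="person">
    <email>dev@example.org</email>
  </maintainer>
  <maintainer>
    <email>untyped@example.org</email>
  </maintainer>
</pkgmetadata>
"""

PROJECTS_XML = """
<projects>
  <project>
    <email>kde@gentoo.org</email>
    <name>KDE</name>
  </project>
</projects>
"""


def project_names_by_email(projects_root):
    """Build an email -> project name index from a projects.xml tree."""
    index = {}
    for project in projects_root.findall("project"):
        email = project.findtext("email")
        if email:
            name = project.findtext("name", default="")
            index[email.strip().lower()] = name.strip()
    return index


def project_names_for_package(metadata_root, index):
    """Return project names for maintainers marked type="project"."""
    names = []
    for maintainer in metadata_root.findall("maintainer"):
        if maintainer.get("type") != "project":
            continue  # skip persons and maintainers without a type
        email = maintainer.findtext("email", default="").strip().lower()
        # None marks a project email that projects.xml does not know
        names.append(index.get(email))
    return names


index = project_names_by_email(ET.fromstring(PROJECTS_XML))
names = project_names_for_package(ET.fromstring(METADATA_XML), index)
print(names)
```

With this sample data the name comes back as "KDE" from projects.xml,
while metadata.xml says "Gentoo KDE Project", which is exactly the
mismatch described above.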
|
And this is why I avoided the topic and hoped that the 'migration' would
make sense:
(1) Using XML is mildly insane. Neither machine- nor human-readable
(2) The DTD is even more insane, and few people have the patience to
figure it out
(3) The recent changes to the DTD change the data model in subtle ways
so that there's even *more* denormalization possible
(4) The tooling is, due to XML, wonderfully horrible and requires things
like XPath to get the required data (because query by attribute is
harder than query by tag)
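
As a rough illustration of point (4), this is how the two queries differ
in ElementTree. The XML snippets are made-up samples, and the variable
names are only for this sketch.

```python
import xml.etree.ElementTree as ET

# Made-up samples of the old and the new metadata.xml layout.
OLD = ET.fromstring("<pkgmetadata><herd>kde</herd></pkgmetadata>")
NEW = ET.fromstring(
    "<pkgmetadata>"
    "<maintainer type='project'><email>kde@gentoo.org</email></maintainer>"
    "<maintainer type='person'><email>dev@example.org</email></maintainer>"
    "</pkgmetadata>"
)

# Old layout: query by tag, and the element text is the herd name.
herds = [herd.text for herd in OLD.findall("herd")]

# New layout: query by attribute needs an XPath predicate, and the
# result is an email address rather than a name.
project_emails = [
    maintainer.findtext("email")
    for maintainer in NEW.findall("maintainer[@type='project']")
]

print(herds, project_emails)
```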
|
There are fundamental questions that should be handled before doing
more modifications - for example, should the data be more normalized
(e.g. name only in projects.xml / maintainers.xml and only email in
metadata.xml)? If we allow denormalization, do we have tools to check
and autocorrect (e.g. a maintainer changing name)?
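
A check-and-autocorrect tool could look roughly like the sketch below.
It assumes a canonical email -> name mapping is already available (for
instance built from projects.xml); the function names and the sample
data are hypothetical.

```python
import xml.etree.ElementTree as ET


def find_name_mismatches(metadata_root, canonical_names):
    """List (email, local name, canonical name) where the names differ."""
    mismatches = []
    for maintainer in metadata_root.iter("maintainer"):
        email = (maintainer.findtext("email") or "").strip().lower()
        name_el = maintainer.find("name")
        canonical = canonical_names.get(email)
        if name_el is None or canonical is None:
            continue  # nothing duplicated, or nothing to compare against
        local = (name_el.text or "").strip()
        if local != canonical:
            mismatches.append((email, local, canonical))
    return mismatches


def autocorrect_names(metadata_root, canonical_names):
    """Overwrite differing <name> values in place; return how many changed."""
    changed = 0
    for maintainer in metadata_root.iter("maintainer"):
        email = (maintainer.findtext("email") or "").strip().lower()
        name_el = maintainer.find("name")
        canonical = canonical_names.get(email)
        if name_el is not None and canonical is not None:
            if (name_el.text or "").strip() != canonical:
                name_el.text = canonical
                changed += 1
    return changed


# Made-up sample: the name in metadata.xml has drifted from the canonical one.
metadata = ET.fromstring(
    "<pkgmetadata>"
    "<maintainer type='project'>"
    "<email>kde@gentoo.org</email><name>Gentoo KDE Project</name>"
    "</maintainer>"
    "<maintainer type='person'><email>dev@example.org</email></maintainer>"
    "</pkgmetadata>"
)
canonical = {"kde@gentoo.org": "KDE"}

stale = find_name_mismatches(metadata, canonical)
fixed = autocorrect_names(metadata, canonical)
```

Entries without a <name> are left alone here, so a fully normalized
metadata.xml (email only) would pass the check trivially.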
|
Once we decide to abstract it away so that people should use tools and
not mangle it manually (have you looked at herds.xml ?! omg ...) there's
the question ... why XML? It's about the worst format for this job; INI
format is sufficient and easier to parse. Or JSON, or YAML, or whatever
is trendy now. Or do we autogenerate from templates?
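
For comparison, here is what the same maintainer data might look like in
INI form, parsed with Python's configparser. The layout (one section per
email, with type and name keys) is purely hypothetical; nothing like it
exists in the tree.

```python
import configparser

# Hypothetical INI layout: one section per maintainer email.
SAMPLE = """
[kde@gentoo.org]
type = project
name = KDE

[dev@example.org]
type = person
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)

# Filtering by type is a plain key lookup, no XPath involved.
projects = [
    email for email in parser.sections()
    if parser[email].get("type") == "project"
]
kde_name = parser["kde@gentoo.org"]["name"]

print(projects, kde_name)
```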
|
Another funny thing: projects.xml is not in the same repository, so
synchronizing changes gets more tricky. And the metadata.dtd is in yet
another place. Wouldn't it make sense to have this organized in a less
confusing way?
|
You see where this is going - and why I didn't object loudly enough to
the changes: I want to not care about this whole cluster of topics and
do things that are more rewarding. But that choice got taken away when
things broke (oh, they didn't break, they Function Differently now) and
I had to spend some time investigating why things deviate.
|
Sigh.
|
Am I grumpy?