Gentoo Archives: gentoo-dev

From:	"Michał Górny" <mgorny@g.o>
To:	Patrick Lauer <patrick@g.o>
Cc:	gentoo-dev@l.g.o
Subject:	Re: [gentoo-dev] Uncoordinated changes
Date:	Sun, 14 Feb 2016 14:37:22
Message-Id:	`20160214153706.43fd1eaf.mgorny@gentoo.org`
In Reply to:	Re: [gentoo-dev] Uncoordinated changes by Patrick Lauer

1	On Sun, 14 Feb 2016 12:37:33 +0100
2	Patrick Lauer <patrick@g.o> wrote:
3
4	> On 02/11/2016 09:15 PM, Patrick Lauer wrote:
5	> > Now instead of looking up [metadata.xml] -> (herd name) -> [herds.xml]
6	> > -> email it goes backwards:
7	> > [metadata.xml] -> (maintainer type=project) -> email -> [projects.xml]
8	> > -> Project name
9	> >
10	> > Since this involves XML and python's ElementTree library it's a
11	> > nontrivial change that also removes a few now useless helpers
12	> > (_get_herd_email has no reason to be, but we'd need a _get_herd_name
13	> > helper instead. Err, get_proj ... ah well, whatever name works)
14	> >
15	> > And all that just so (1) gentoolkit output works and (2) euscan updates
16	> > properly. Both of which I don't really care about much, but now that
17	> > I've invested ~4h into debugging and trying to fix it I'm a tiny bit
18	> > IRRITATED.
19	> >
20	> So this turns out to be more fun than expected.
21	>
22	> Having spent a little bit of time staring at XML, DTDs and wondering why
23	> we do things the most difficult way ...
24	>
25	> Previously the herd tag was defined as:
26	> <!ELEMENT herd (#PCDATA)>
27	>
28	> So we end up with, for example:
29	> <herd>kde</herd>
30	>
31	> The new schema collapses herd (err, project!) into maintainers (err,
32	> sustainers ... staff ... linchpin?)
33	> And maintainer is defined as:
34	> <!ELEMENT maintainer ( email, (description | name)* )>
35	>
36	> Which means that only email is mandatory. So instead of search by name
37	> you are now required to search by email.
38
39	Congratulations! After the whole discussion, GLEP, explanatory blog
40	post and explanatory mails, you finally figured out how things work!
41
42	> And it leads to inconsistent (partial) duplication: Some metadata.xml
43	> entries carry Name, some Description, and some are Email only.
44	>
45	> For example for gentoolkit this means that instead of search by name now
46	> it needs to be search by email, and the previous search by name
47	> functionality requires herds.xml, err, projects.xml to figure out the
48	> name of a project. Which might not match the one in metadata.xml!
49	> (And you may need to filter out maintainers-that-are-not-projects, and
50	> what about maintainers that are undefined? So much extra code complexity!)
51
52	Everything becomes complex if you try hard to make it sound complex.
53
54	Maintainer is a single entity, and it's identified by e-mail. So if you
55	want to group packages, you just group by e-mail. Simple as that. No
56	special magic needed. No extra files needed. The basic functionality
57	works without any special needs. You can also use it straight to
58	contact the maintainer or assign bugs.
59
60	If you want to split maintainers into people and projects, you've got
61	type="". Which is also in-place, with no extra files needed.
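
A minimal sketch of the grouping described above, using Python's ElementTree. The package names, e-mail addresses and the `SAMPLE_METADATA` dict are made up for illustration; the helper names `maintainers` and `group_by_email` are hypothetical and do not come from gentoolkit or any other existing tool.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical metadata.xml contents, keyed by made-up package names.
SAMPLE_METADATA = {
    "app-misc/foo": """
        <pkgmetadata>
          <maintainer type="project">
            <email>kde@example.org</email>
            <name>KDE</name>
          </maintainer>
        </pkgmetadata>""",
    "app-misc/bar": """
        <pkgmetadata>
          <maintainer type="project">
            <email>kde@example.org</email>
          </maintainer>
          <maintainer type="person">
            <email>larry@example.org</email>
          </maintainer>
        </pkgmetadata>""",
}


def maintainers(xml_text):
    """Yield (email, type) for every <maintainer> in one metadata.xml."""
    root = ET.fromstring(xml_text.strip())
    for m in root.findall("maintainer"):
        email = m.findtext("email", default="").strip()
        yield email, m.get("type")


def group_by_email(metadata, only_type=None):
    """Map maintainer e-mail -> list of packages, optionally filtered by type."""
    groups = defaultdict(list)
    for pkg, xml_text in metadata.items():
        for email, mtype in maintainers(xml_text):
            if only_type is None or mtype == only_type:
                groups[email].append(pkg)
    return dict(groups)


all_groups = group_by_email(SAMPLE_METADATA)
project_groups = group_by_email(SAMPLE_METADATA, only_type="project")
```

Here the e-mail is the only key, and the person/project split is a plain comparison on the `type` attribute; no second file is consulted.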
62
63	If you want pretty project names, then you can use projects.xml.
64	Or the name in metadata.xml. Both are fine by definition.
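
One possible way to resolve a display name, sketched under assumptions: the `<project>` layout with `<email>` and `<name>` children is assumed here for projects.xml, the sample data is invented, and `project_names` and `display_name` are hypothetical helpers. The order of preference (name from metadata.xml first, then projects.xml, then the bare e-mail) is just one choice, since either source is acceptable.

```python
import xml.etree.ElementTree as ET

# Assumed, simplified projects.xml layout with made-up data.
PROJECTS_XML = """<projects>
  <project>
    <email>kde@example.org</email>
    <name>KDE</name>
  </project>
</projects>"""


def project_names(projects_xml):
    """Build an e-mail -> project name mapping from projects.xml text."""
    names = {}
    for project in ET.fromstring(projects_xml).findall("project"):
        email = project.findtext("email")
        name = project.findtext("name")
        if email and name:
            names[email.strip()] = name.strip()
    return names


def display_name(maintainer, names):
    """Pick a name for one <maintainer> element, falling back to the e-mail."""
    email = maintainer.findtext("email", default="").strip()
    local = maintainer.findtext("name")
    if local and local.strip():
        return local.strip()  # name given directly in metadata.xml
    return names.get(email, email)  # projects.xml, else the e-mail itself


names = project_names(PROJECTS_XML)
without_name = ET.fromstring(
    '<maintainer type="project"><email>kde@example.org</email></maintainer>'
)
unknown = ET.fromstring(
    '<maintainer type="person"><email>larry@example.org</email></maintainer>'
)
```

With this data, `display_name(without_name, names)` takes the name from the projects mapping, while `display_name(unknown, names)` falls back to the e-mail address.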
65
66	Now, the surprise: current people-maintainers could already have
67	different (or no) names in different metadata.xml files! You didn't
68	expect that, did you? In other words, that's not a new issue, nor
69	a major problem.
70
71	> And this is why I avoided the topic and hoped that the 'migration' would
72	> make sense:
73	> (1) Using XML is mildly insane. Neither machine- nor human-readable
74
75	Wrong. It works for machines well.
76
77	> (2) The DTD is even more insane, and few people have the patience to
78	> figure it out
79
80	And what do you need the DTD for? Furthermore, it's in the process of being
81	replaced.
82
83	> (3) The recent changes to the DTD change the data model in subtle ways
84	> so that there's even more denormalization possible
85	> (4) The tooling is, due to XML, wonderfully horrible and requires things
86	> like XPATH to get the required data (because query by attribute is
87	> harder than query by tag)
88
89	Of course it does. Because no modern programming languages provide such
90	complex features as conditionals!
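
To illustrate the point about querying by attribute, a small sketch with Python's ElementTree and invented sample data. It shows the same selection done twice: once with an ordinary conditional on the attribute, and once with the attribute predicate from the limited XPath subset that ElementTree's `findall` supports.

```python
import xml.etree.ElementTree as ET

# Made-up metadata.xml fragment for illustration.
METADATA = """<pkgmetadata>
  <maintainer type="person"><email>larry@example.org</email></maintainer>
  <maintainer type="project"><email>kde@example.org</email></maintainer>
</pkgmetadata>"""

root = ET.fromstring(METADATA)

# Plain conditional on the attribute value, no XPath involved.
by_conditional = [
    m.findtext("email")
    for m in root.iter("maintainer")
    if m.get("type") == "project"
]

# The same query using ElementTree's attribute predicate syntax.
by_predicate = [
    m.findtext("email")
    for m in root.findall("maintainer[@type='project']")
]
```

Both lists contain the same project e-mail, so which form to use is a matter of taste.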
91
92	> There are fundamental questions that should be handled before doing more
93	> modifications - for example, should the data be more normalized (e.g.
94	> name only in projects.xml / maintainers.xml and only email in
95	> metadata.xml)? If we allow denormalization, do we have tools to check
96	> and autocorrect (e.g. a maintainer changing name)?
97	>
98	> Once we decide to abstract it away so that people should use tools and
99	> not mangle it manually (have you looked at herds.xml ?! omg ...) there's
100	> the question ... why XML? It's about the worst format for this job, INI
101	> format is sufficient and easier to parse. Or JSON, or YAML, or whatever
102	> is trendy now. Or do we autogenerate from templates?
103
104	What is the gain? Who is going to fix all the tools?
105
106	> Another funny thing: projects.xml is not in the same repository, so
107	> synchronizing changes gets more tricky. And the metadata.dtd is in yet
108	> another place. Wouldn't it make sense to have this organized in a less
109	> confusing way?
110
111	projects.xml is autogenerated from the wiki. Yes, the place you refuse to
112	visit. Which means you'll never exist in projects.xml.
113
114	DTDs are not needed for anything, except for doing poor man's
115	correctness verification.
116
117	> You see where this is going - and why I didn't object loud enough to the
118	> changes: I want to not care about this whole cluster of topics and do
119	> things that are more rewarding. But that choice got taken away when
120	> things broke (oh, they didn't break, they Function Differently now) and
121	> I had to spend some time investigating why things deviate.
122
123	Of course you had to. Because reading is hard.
124
125	--
126	Best regards,
127	Michał Górny
128	<http://dev.gentoo.org/~mgorny/>
