Gentoo Archives: gentoo-dev

From: "Michał Górny" <mgorny@g.o>
To: gentoo-dev@l.g.o
Cc: qa@g.o, Gentoo Council <council@g.o>
Subject: [gentoo-dev] XML Schema files for metadata.xml, projects.xml and repositories.xml, for review and testing
Date: Sun, 06 Mar 2016 11:01:41
Message-Id: 20160306120119.2df280d6.mgorny@gentoo.org
Hello, everyone.

As you may be aware, we were considering replacing the DTD files for
our XML documents with a more modern and more complete format. As part
of considering options for this, I've written XML Schema files [1] that
provide a more correct replacement for the current DTD files, and I'd
like you to review them.

XML Schema not only allows us to express our data formats more
correctly than DTD but also provides some degree of value checking.
In particular, it finds a number of existing issues that DTD can't
find.

For example:

$ xmllint --noout --schema projects.xsd projects.xml
projects.xml:864: element project: Schemas validity error : Element
'project': Duplicate key-sequence ['desktop-misc@g.o'] in key
identity-constraint 'projectKey'.
projects.xml fails to validate

which means there are two projects using the same e-mail address,
which makes metadata references to them ambiguous.
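
To illustrate what the 'projectKey' constraint reports, here is a minimal Python sketch that looks for project e-mail addresses used more than once. It is not the schema itself, only a plain re-implementation of the same uniqueness check. The sample document, the example.org addresses, the function name, and the assumed layout (each <project> carrying a direct <email> child) are all hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample; the real projects.xml layout is assumed, not quoted.
SAMPLE = """\
<projects>
  <project><email>desktop-misc@example.org</email><name>Desktop Misc</name></project>
  <project><email>qa@example.org</email><name>QA</name></project>
  <project><email>desktop-misc@example.org</email><name>Desktop Miscellaneous</name></project>
</projects>
"""


def duplicate_project_emails(xml_text):
    """Return the sorted e-mail addresses shared by more than one <project>."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for project in root.iter("project"):
        # findtext() only looks at direct children, so e-mail addresses
        # nested deeper inside a project are not counted.
        email = (project.findtext("email") or "").strip()
        if email:
            counts[email] += 1
    return sorted(email for email, seen in counts.items() if seen > 1)


duplicates = duplicate_project_emails(SAMPLE)
print(duplicates)
```

With the schema, the same condition is caught during validation, so no separate script is needed.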

Aside from the usual structural errors, my schemas find:

- duplicate keys (project e-mails, repository names),

- duplicate supposedly-unique values (like duplicate
<longdescription/>s in the same language),

- some data well-formedness errors (e.g. <pkg/> tags referencing
things that are not correct qualified package names),

- some random oddities (like using multiple <use/> blocks for
flags, for no good reason).

I should note that I've based those schemas on the existing DTDs, PMS
and some guesswork, so they may be too strict or not strict enough.
If someone can provide better PMS-y package name regexps, I'd
appreciate it.
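
As a starting point for that discussion, here is a rough Python sketch of a qualified package name check. It assumes the PMS rules as I understand them: category names may use [A-Za-z0-9+_.-], package names may use [A-Za-z0-9+_-], neither may start with a hyphen or plus sign (nor a category with a dot), and a package name must not end in a hyphen followed by something that looks like a version. The version pattern is a simplification, and the function and constant names are hypothetical. This is not the regexp used in the schemas.

```python
import re

# Character classes as assumed from PMS; not taken from the schema files.
CATEGORY = r"[A-Za-z0-9_][A-Za-z0-9+_.-]*"
PACKAGE = r"[A-Za-z0-9_][A-Za-z0-9+_-]*"
QUALIFIED_NAME = re.compile(rf"({CATEGORY})/({PACKAGE})")

# Simplified idea of a trailing "-<version>" that a package name
# must not end with (e.g. "-1", "-2a", "-1_rc1", "-3-r1").
VERSION_SUFFIX = re.compile(
    r"-\d+(\.\d+)*[a-z]?(_(alpha|beta|pre|rc|p)\d*)*(-r\d+)?$"
)


def is_qualified_package_name(text):
    """Return True if text looks like category/package without a version."""
    match = QUALIFIED_NAME.fullmatch(text)
    if match is None:
        return False
    # Reject package names whose tail could be mistaken for a version.
    return VERSION_SUFFIX.search(match.group(2)) is None


for candidate in ("dev-lang/python", "x11-libs/gtk+", "app-misc/foo-1", "python"):
    print(candidate, is_qualified_package_name(candidate))
```

A regular expression inside an XML Schema pattern facet cannot use look-ahead, so the version-suffix rule may need to be expressed differently there, or left to a separate check.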

Please test and review. I'm going to reply to this mail with the list
of current metadata.xml validation failures (it's quite long).

[1]: https://github.com/mgorny/gentoo-xml-schema

--
Best regards,
Michał Górny
<http://dev.gentoo.org/~mgorny/>
