Gentoo Archives: gentoo-dev

From: "Ulrich Müller" <ulm@g.o>
To: gentoo-dev@l.g.o
Cc: "Ulrich Müller" <ulm@g.o>
Subject: [gentoo-dev] [PATCH 0/1] GLEP 68: Allow EAPI 5 dependency specifications
Date: Wed, 22 Feb 2023 16:07:39
Message-Id: 20230222160704.6748-1-ulm@gentoo.org
1 This was discussed in #gentoo-council a while ago.
2
3 There is no EAPI identification mechanism in metadata.xml, but EAPI 5
4 is specified in the top-level profile directory, and it has been
5 supported by Portage for 10+ years.
6
7 Find the updated full text of GLEP 68 below, and a diff in the next
8 message.
9
10 Ulrich Müller (1):
11 glep-0068: Allow EAPI 5 dependency specifications
12
13 glep-0068.rst | 16 +++++++++++-----
14 1 file changed, 11 insertions(+), 5 deletions(-)
15
16 --
17 2.39.2
18
19 ---
20 GLEP: 68
21 Title: Package and category metadata
22 Author: Michał Górny <mgorny@g.o>
23 Type: Standards Track
24 Status: Final
25 Version: 1.4
26 Created: 2016-03-14
27 Last-Modified: 2023-01-22
28 Post-History: 2016-03-16, 2018-02-20, 2022-05-22, 2022-10-07
29 Content-Type: text/x-rst
30 Requires: 67
31 Replaces: 34, 46, 56
32 ---
33
34 Abstract
35 ========
36
37 This GLEP specifies the format of files used to describe category and package
38 metadata (``metadata.xml``).
39
40
41 Motivation
42 ==========
43
44 At the time this GLEP was written, category and package ``metadata.xml``
45 files lacked a proper specification. PMS Appendix A [#PMS-A]_ specified that
46 the format of this file is beyond its scope, deferring the specification
47 to the DTD file.
48
49 The original metadata.dtd file [#METADATA-DTD]_ (the version before cleanups
50 related to this spec) did not serve well as the specification. Due to
51 the technical limitations of the DTD format, it could neither enforce
52 the specification fully nor explain it in a readable form. Furthermore,
53 it lacked some important details such as the format of ``<pkg/>`` entries.
54
55 Besides that, there were numerous alterations to the format. GLEP 34 added
56 metadata files for category descriptions, GLEP 46 added upstream information,
57 GLEP 56 added USE flag descriptions, GLEP 67 altered the maintainer
58 descriptions. Furthermore, there were additions and removals done without
59 a formal specification, e.g. addition of slot descriptions.
60
61 Sadly, some of those GLEPs partially conflict with other specifications:
62 for example, the ``<pkg/>`` element as described in GLEP 56 differs from
63 the one originally proposed and used in metadata.xml.
64
65 Therefore, the motivation for this GLEP is to provide a unified, clear
66 and complete specification for both category-wide and package-wide
67 metadata.xml files. It is meant to combine previous GLEPs, relevant
68 discussions and implementation in order to provide the specification that is
69 closest to the originally intended meaning while preserving best compatibility
70 with existing tools and data.
71
72
73 Specification
74 =============
75
76 Metadata files
77 --------------
78
79 This specification provides two kinds of metadata files: category metadata
80 files and package metadata files. Both kinds of files use the XML 1.0 file
81 format [#XML10]_. They must not use external markup declarations, as defined
82 in the XML specification. While they may reference or include a DTD, the parser
83 must not fetch or process it.
84
85 The data structure of metadata files is defined in this GLEP. The elements
86 and attributes do not use namespaces. Conforming files must not contain
87 any elements or attributes that are not defined in this specification.
88 However, parsers should ignore any unknown elements or attributes in order
89 to permit future extension.
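
The rule that parsers should ignore unknown elements can be sketched as follows. This is only an illustration using Python's standard ``xml.etree.ElementTree`` module. The names ``KNOWN_TOP_LEVEL`` and ``known_children``, the sample document and the ``<future-element/>`` tag are assumptions made for the example and are not part of this specification.

```python
import xml.etree.ElementTree as ET

# Child elements of <pkgmetadata/> that this hypothetical reader understands.
KNOWN_TOP_LEVEL = {
    "longdescription", "maintainer", "slots",
    "stabilize-allarches", "use", "upstream",
}

def known_children(xml_text):
    """Return the known child elements of the root, skipping unknown ones."""
    root = ET.fromstring(xml_text)
    return [child for child in root if child.tag in KNOWN_TOP_LEVEL]

sample = """<pkgmetadata>
  <maintainer type='person'><email>dev@example.com</email></maintainer>
  <future-element>ignored by this reader</future-element>
</pkgmetadata>"""

tags = [element.tag for element in known_children(sample)]
print(tags)
```

A validating tool would instead report ``<future-element/>`` as non-conforming; only readers are asked to be lenient.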
90
91 Category metadata files are named ``metadata.xml`` and located inside category
92 directories in an ebuild repository. Their structure is described
93 in the `Category metadata`_ section.
94
95 Package metadata files are named ``metadata.xml`` and located inside package
96 directories in an ebuild repository. Their structure is described
97 in the `Package metadata`_ section.
98
99 Text data
100 ---------
101
102 The following text data types are used:
103
104 - text data,
105 - multi-line text data.
106
107 In case of text data, all whitespace inside the element is normalized
108 (consecutive whitespace sequences are replaced by a single SP). Trailing
109 and leading whitespace is stripped.
110
111 In case of multi-line text data, all whitespace except for newline characters
112 is normalized. Newlines are used to delimit lines of text. Leading
113 and trailing lines of text that are either empty or consist purely of
114 whitespace are stripped. Afterwards, the whitespace belonging to
115 the indentation common to all non-empty lines of text is stripped.
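
A minimal sketch of both normalization rules, assuming that "whitespace" means the XML whitespace characters (space, tab, CR, LF). The function names are hypothetical. The steps are applied in the order given above, so the common indentation left after collapsing is at most one space. This is one reading of the text, not a reference implementation.

```python
import re

def normalize_text(value):
    """Replace each run of XML whitespace with one space, then trim."""
    return re.sub(r"[ \t\r\n]+", " ", value).strip(" ")

def normalize_multiline(value):
    """Normalize multi-line text while keeping newlines as line breaks."""
    # Collapse runs of non-newline whitespace within every line.
    lines = [re.sub(r"[ \t\r]+", " ", line) for line in value.split("\n")]
    # Drop leading and trailing lines that are empty or whitespace-only.
    while lines and not lines[0].strip(" "):
        lines.pop(0)
    while lines and not lines[-1].strip(" "):
        lines.pop()
    # Remove the indentation shared by all non-empty lines.
    indents = [len(l) - len(l.lstrip(" ")) for l in lines if l.strip(" ")]
    common = min(indents, default=0)
    return "\n".join(l[common:] if l.strip(" ") else "" for l in lines)

raw = "\n    First paragraph.\n\n    Second   paragraph.\n  "
print(normalize_multiline(raw))
print(normalize_text("  Enables   foo\n feature "))
```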
116
117 Optionally, interspersing text with ``<cat/>`` and ``<pkg/>`` elements can be
118 allowed. In this case, ``<cat/>`` element is used to reference a category
119 inside the repository, and must contain a valid category name. ``<pkg/>``
120 is used to reference a package, and must contain a valid qualified package
121 name.
122
123 Common attributes
124 -----------------
125
126 The following common attributes are allowed on multiple elements:
127
128 - language specifiers,
129 - restriction specifiers.
130
131 Language specifiers are used whenever an element supports variants
132 in different languages. In this case, each occurrence of the element may
133 contain an optional ``lang=""`` attribute that contains an IETF language tag
134 [#BCP-47]_. In case no ``lang=""`` attribute is provided, an implicit default
135 of ``en`` is assumed.
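
The implicit ``en`` default could be handled as in the sketch below. The helper name ``pick_by_language`` and the sample document are assumptions. Tags are compared by plain string equality for brevity, although BCP 47 tags are case-insensitive.

```python
import xml.etree.ElementTree as ET

def pick_by_language(elements, wanted):
    """Return the first element whose effective language equals `wanted`."""
    for element in elements:
        # A missing lang attribute implies the default language, "en".
        if element.get("lang", "en") == wanted:
            return element
    return None

root = ET.fromstring(
    "<pkgmetadata>"
    "<longdescription>English text.</longdescription>"
    "<longdescription lang='de'>Deutscher Text.</longdescription>"
    "</pkgmetadata>"
)
descriptions = root.findall("longdescription")
english = pick_by_language(descriptions, "en")
german = pick_by_language(descriptions, "de")
print(english.text, "|", german.text)
```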
136
137 Restriction specifiers are used whenever an element supports restricting to
138 specific package versions. In this case, each occurrence of the element may
139 contain an optional ``restrict=""`` attribute that contains an EAPI 5
140 dependency specification that has to match one or more versions of the
141 package. In this case, the metadata provided by the element applies only to
142 the package versions matching the restriction.
143
144 Category metadata
145 -----------------
146
147 The category metadata file uses ``<catmetadata/>`` top-level element. This
148 element can contain, in any order:
149
150 - zero or more ``<longdescription/>`` elements containing category
151 descriptions in different languages (at most one for each language).
152 The category description is formed of multi-line text, optionally
153 interspersed with ``<cat/>`` and ``<pkg/>`` elements.
154
155 Package metadata
156 ----------------
157 Top-level structure
158 ~~~~~~~~~~~~~~~~~~~
159 The package metadata file uses ``<pkgmetadata/>`` top-level element. This
160 element can contain, in any order:
161
162 - zero or more ``<longdescription/>`` elements containing package descriptions
163 in different languages, possibly restricted to specific package versions
164 (at most one for each combination of language and package version).
165 The package description is formed of multi-line text, optionally
166 interspersed with ``<cat/>`` and ``<pkg/>`` elements.
167
168 - zero or more ``<maintainer/>`` elements listing package maintainers,
169 optionally restricted to specific package versions. The maintainer format
170 is detailed in `Maintainer descriptions`_.
171
172 - zero or more ``<slots/>`` elements containing slot descriptions in different
173 languages (at most one for each language), as detailed
174 in `Slot descriptions`_.
175
176 - zero or more ``<stabilize-allarches/>`` elements, possibly restricted
177 to specific package versions (at most one for each version) whose presence
178 indicates that the appropriate ebuilds are suitable for simultaneously
179 marking stable on all architectures where a previous version is stable
180 after arch testing on one of them (i.e. if the package is known to be fully
181 arch-independent).
182
183 - zero or more ``<use/>`` elements containing USE flag descriptions
184 in different languages (at most one for each language), as detailed
185 in `USE flag descriptions`_.
186
187 - at most one ``<upstream/>`` element providing information on upstream
188 of the package, as detailed in `Upstream descriptions`_.
189
190 Maintainer descriptions
191 ~~~~~~~~~~~~~~~~~~~~~~~
192 Each ``<maintainer/>`` element describes a single maintainer.
193
194 The ``<maintainer/>`` element has an obligatory ``type=""`` attribute whose
195 value can be either ``person`` or ``project``.
196
197 The ``<maintainer/>`` element contains the following elements, in any order:
198
199 - exactly one ``<email/>`` element that contains the maintainer's e-mail
200 address (used as unique identifier),
201
202 - at most one ``<name/>`` element that contains the maintainer's
203 human-readable name (real name or nickname),
204
205 - zero or more ``<description/>`` elements that explain the role
206 of the maintainer in different languages (at most one ``<description/>``
207 for each language).
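
The cardinality rules above can be checked with a few lines of Python. This is a sketch only: ``check_maintainer`` and its message strings are hypothetical, and it does not verify that the e-mail address itself is well-formed.

```python
import xml.etree.ElementTree as ET

def check_maintainer(element):
    """Return a list of problems found in one package <maintainer/>."""
    problems = []
    if element.get("type") not in ("person", "project"):
        problems.append("type must be 'person' or 'project'")
    if len(element.findall("email")) != 1:
        problems.append("exactly one <email/> is required")
    if len(element.findall("name")) > 1:
        problems.append("at most one <name/> is allowed")
    # Descriptions are keyed by language; a missing lang means "en".
    langs = [d.get("lang", "en") for d in element.findall("description")]
    if len(langs) != len(set(langs)):
        problems.append("at most one <description/> per language")
    return problems

good = ET.fromstring(
    "<maintainer type='person'><email>dev@example.com</email>"
    "<name>Example Developer</name></maintainer>"
)
bad = ET.fromstring(
    "<maintainer type='herd'><name>A</name><name>B</name></maintainer>"
)
print(check_maintainer(good))
print(check_maintainer(bad))
```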
208
209 Slot descriptions
210 ~~~~~~~~~~~~~~~~~
211 Each ``<slots/>`` element describes slots of a package (in a specific language).
212
213 The ``<slots/>`` element can contain the following elements:
214
215 - zero or more ``<slot/>`` elements describing specific ebuild slots
216 (at most one for each slot name).
217 The ``<slot/>`` element contains an obligatory ``name=""`` attribute stating
218 the slot to which the description applies, and contains slot description as
219 text. Alternatively, a slot name of ``*`` can be used to indicate a single
220 description applying to all slots (no other ``<slot/>`` elements may be used
221 in this case).
222
223 - at most one ``<subslots/>`` element describing the role of subslots (all
224 of them) as text.
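
As an illustration of the ``*`` rule and the per-name uniqueness rule, a checker might look like the sketch below. ``check_slots`` and its messages are made up for this example.

```python
import xml.etree.ElementTree as ET

def check_slots(slots_element):
    """Return a list of problems found in one <slots/> element."""
    problems = []
    names = [slot.get("name") for slot in slots_element.findall("slot")]
    if None in names:
        problems.append("every <slot/> needs a name attribute")
    if len(names) != len(set(names)):
        problems.append("at most one <slot/> per slot name")
    # A '*' entry describes all slots, so it cannot be combined with others.
    if "*" in names and len(names) > 1:
        problems.append("'*' must be the only <slot/>")
    if len(slots_element.findall("subslots")) > 1:
        problems.append("at most one <subslots/> is allowed")
    return problems

ok = ET.fromstring(
    "<slots><slot name='11'>Compatibility slot.</slot>"
    "<subslots>Match SONAME.</subslots></slots>"
)
mixed = ET.fromstring(
    "<slots><slot name='*'>All slots.</slot>"
    "<slot name='11'>Compatibility slot.</slot></slots>"
)
print(check_slots(ok))
print(check_slots(mixed))
```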
225
226 USE flag descriptions
227 ~~~~~~~~~~~~~~~~~~~~~
228 Each ``<use/>`` element describes USE flags of a package (in a specific
229 language).
230
231 The ``<use/>`` element can contain the following elements:
232
233 - zero or more ``<flag/>`` elements describing specific USE flags, optionally
234 restricted to specific package versions (at most one entry for a combination
235 of USE flag name and package version). The ``<flag/>`` element contains
236 an obligatory ``name=""`` attribute stating the name of the USE flag to
237 which the description applies, and contains text, optionally interspersed
238 with ``<cat/>`` and ``<pkg/>`` elements.
239
240 Upstream descriptions
241 ~~~~~~~~~~~~~~~~~~~~~
242 The ``<upstream/>`` element provides information on the upstream of a package.
243 It contains the following elements:
244
245 - zero or more ``<maintainer/>`` elements listing package's upstream
246 maintainers, as described in `Upstream maintainer descriptions`_,
247
248 - at most one ``<changelog/>`` element containing the URL of an on-line copy
249 of the upstream changelog,
250
251 - zero or more ``<doc/>`` elements containing URLs to on-line copies
252 of upstream documentation in different languages (at most one for each
253 language),
254
255 - at most one ``<bugs-to/>`` element containing the upstream bug reporting
256 URL, which can optionally be a ``mailto:`` URL,
257
258 - zero or more ``<remote-id/>`` elements listing package identities on package
259 identification trackers. Each of those elements has an obligatory
260 ``type=""`` attribute that matches a pre-defined name of a package
261 identification tracker, and a value that is an identifier specific to
262 the tracker. The list of available trackers and their specific identifiers
263 is outside the scope of this specification.
264
265 Upstream maintainer descriptions
266 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
267 Each ``<maintainer/>`` element inside ``<upstream/>`` describes a single
268 upstream maintainer.
269
270 The ``<maintainer/>`` element has an optional ``status=""`` attribute whose
271 value can be either ``active`` or ``inactive``. If not specified, an implicit
272 ``unknown`` value is assumed.
273
274 The ``<maintainer/>`` element contains the following elements, in any order:
275
276 - at most one ``<email/>`` element that contains the maintainer's e-mail
277 address,
278
279 - exactly one ``<name/>`` element that contains the maintainer's
280 human-readable name (real name or nickname).
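
For comparison with package maintainers, the upstream variant can be sketched as follows. The helper names are assumptions for illustration. Note that ``unknown`` is only an implied value here and is treated as invalid when written explicitly, which is one reading of the text above.

```python
import xml.etree.ElementTree as ET

def effective_status(element):
    """Return the status of an upstream maintainer, defaulting to unknown."""
    return element.get("status", "unknown")

def check_upstream_maintainer(element):
    """Return a list of problems found in one upstream <maintainer/>."""
    problems = []
    if "status" in element.attrib and element.get("status") not in (
        "active", "inactive"
    ):
        problems.append("status must be 'active' or 'inactive'")
    if len(element.findall("email")) > 1:
        problems.append("at most one <email/> is allowed")
    if len(element.findall("name")) != 1:
        problems.append("exactly one <name/> is required")
    return problems

no_status = ET.fromstring("<maintainer><name>John Smith</name></maintainer>")
retired = ET.fromstring(
    "<maintainer status='retired'><email>up@example.com</email></maintainer>"
)
print(effective_status(no_status), check_upstream_maintainer(no_status))
print(check_upstream_maintainer(retired))
```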
281
282
283 Rationale
284 =========
285
286 Information sources
287 -------------------
288
289 The basic source of information on the current metadata.xml format was
290 ``metadata.dtd`` as of 2016-03-02 [#METADATA-DTD]_. Whenever the DTD
291 was unclear, appropriate GLEPs were referenced in order to deduce the original
292 intent. Whenever the GLEPs were unclear or no GLEP covered an element, the
293 original mailing list discussions were referenced.
294
295 Removed elements
296 ----------------
297
298 Compared to the original DTD, the following elements were removed (both
299 in the spec and in the updated DTD file):
300
301 - package-scope ``<changelog/>`` element was removed. It dates back to the
302 original metadata.xml proposal [#ORIGINAL-METADATA-XML]_ but it was never
303 implemented; instead, plain text ChangeLogs were used. Furthermore,
304 GLEP 46 introduced ``<changelog/>`` inside ``<upstream/>`` with a
305 different type, which collided with the global declaration due to DTD
306 limitations.
307
308 - package-scope ``<natural-name/>`` element was removed. It was available for
309 a year and a half; by then, only four packages provided it and no
310 known tool supported or used it. It was used only to provide a copy of the
311 package name with correct case (e.g. libressl -> LibreSSL), therefore
312 the information provided by it was considered redundant.
313
314 - top-level ``<packages/>`` variant was removed. It was never used and its
315 intended purpose was unclear. In any case, removing it made the DTD
316 simpler.
317
318 <pkg/> value format
319 -------------------
320
321 A debate on valid format of ``<pkg/>`` element values preceded the writing of
322 this GLEP. The DTD did not specify a value format restriction on this, only
323 suggested that it is used *for cross-linking*. Further on, GLEP 56 redefined
324 its value to *a valid CP or CPV*. The practical uses did not include
325 the latter case; however, it was common to include EAPI 1 slot specifiers or
326 even EAPI 5 slot operators following the qualified package names.
327
328 After finding Doug Goldstein's blog post on the introduction of ``<pkg/>``
329 elements [#USE-FLAG-METADATA]_, it turned out that the original intent was to
330 *allow cross-linking/referencing from packages.gentoo.org*. Since the latter
331 uses qualified package names as identifiers, it was decided to restrict
332 ``<pkg/>`` elements to reference those. For entries that include slot
333 specifiers, it is recommended to move the slot specifiers out of ``<pkg/>``
334 element.
335
336 Language identifiers
337 --------------------
338
339 Originally, the DTD used implicit default value of ``C``. However, this value
340 was not in line with real language specifiers found in ``metadata.xml``.
341 The latter usually took the form of ISO 639-1 language codes, which do not form
342 valid (complete) locale identifiers, while the former is not a valid
343 language identifier in any of the considered standards. Furthermore, since
344 ``en`` was commonly used to identify English in metadata.xml files,
345 and no tools relied on the implicit default defined in the DTD, it was decided
346 to change the implicit default to ``en``.
347
348 Language identifiers were later updated to allow full IETF language tags,
349 so that codes like ``pt-BR`` or ``zh-Hant`` can be represented.
350
351 Package restrictions
352 --------------------
353
354 Originally, the DTD described the ``restrict=""`` attribute as: *the format
355 of this attribute is equal to the format of DEPEND lines in ebuilds.* This
356 specification is based upon this definition. However, for practical reasons it
357 added three clarifications to it:
358
359 - only package dependency specifications are allowed (i.e. no USE-conditionals
360 or multiple dependency specifications),
361
362 - EAPI 5 dependency specifications are allowed. Although ``metadata.xml``
363 provides no EAPI identification mechanism, the top-level profile directory
364 specifies EAPI 5, and Portage has supported EAPI 5 since 2012.
365
366 - only dependencies referencing the same package are allowed.
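
The first and third clarifications lend themselves to a mechanical check. The sketch below uses a deliberately simplified regular expression, not the full PMS grammar. It rejects anything that is not a single package dependency specification and then compares the referenced package with the package owning the ``metadata.xml`` file. The pattern and both function names are assumptions made for illustration, and no version matching is performed.

```python
import re

# Simplified pattern: optional operator, category/package, optional version,
# optional slot part, optional USE dependency. Not the complete PMS syntax.
_ATOM = re.compile(
    r"^(?P<op><=|>=|<|>|=|~)?"
    r"(?P<cp>[A-Za-z0-9+_.-]+/[A-Za-z0-9+_-]+?)"
    r"(?P<ver>-\d+(\.\d+)*[a-z]?(_(alpha|beta|pre|rc|p)\d*)*(-r\d+)?\*?)?"
    r"(?P<slot>:[^\[\]]+)?"
    r"(?P<use>\[[^\[\]]*\])?$"
)

def restricted_package(restrict):
    """Return the category/package named by `restrict`, or None."""
    match = _ATOM.match(restrict.strip())
    if match is None:
        return None
    # An operator and a version must appear together or not at all.
    if bool(match.group("op")) != bool(match.group("ver")):
        return None
    return match.group("cp")

def restriction_allowed(restrict, own_package):
    """True if `restrict` is a single specification on `own_package`."""
    return restricted_package(restrict) == own_package

# Attribute values arrive with XML entities already decoded by the parser.
print(restricted_package("<dev-libs/foo-12"))
print(restriction_allowed("dev-libs/bar:0", "dev-libs/foo"))
```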
367
368 Furthermore, the DTD added a special case for the ``*`` value that *applies if
369 there are no other tags that apply*. This behavior was not used at all, and
370 because it was somewhat confusing (``*`` commonly implies matching
371 everything), it was removed.
372
373 Upstream block
374 --------------
375
376 The upstream block was defined by GLEP 46. However, that GLEP is ambiguous
377 at best. Tiziano Müller (one of the original authors) has explained
378 the intent behind most of the elements of the GLEP.
379
380 In particular, he confirmed that the GLEP lists all elements that are allowed
381 explicitly, and no implicit inclusions were meant to be allowed. This means
382 that the ``<maintainer/>`` element does not allow a ``<description/>``.
383
384 He also confirmed that unless noted otherwise, elements were not allowed to
385 be used more than once. This affects ``<bugs-to/>`` and ``<changelog/>``
386 elements. Repetitions of ``<doc/>`` were only allowed because DTD technically
387 didn't permit restricting them while allowing uses of different languages.
388
389 At the time of writing this GLEP, only a single Gentoo package was using
390 multiple ``<bugs-to/>`` elements, and no packages were using multiple
391 ``<changelog/>`` or ``<doc/>`` elements (or non-English docs). For this
392 reason, this GLEP enforces the original intent of *at most one* element.
393
394 Rationale for upstream maintainer descriptions
395 ----------------------------------------------
396
397 The proper contents of the ``<maintainer/>`` elements in ``<upstream/>``
398 blocks were unclear in the DTD since the technical file format limitation
399 implied that all elements and attributes added for the Gentoo maintainers
400 also applied to upstream maintainers, and vice versa.
401
402 The comments in the DTD clearly separated attributes between the two —
403 i.e. stated that the ``type`` attribute is used only for Gentoo maintainers,
404 while the ``status`` attribute is used only for upstream maintainers. However,
405 package version restrictions and maintainer descriptions were also implicitly
406 allowed on them. Since neither of the two was allowed by GLEP 46, this
407 specification disallows them.
408
409
410 Backwards Compatibility
411 =======================
412
413 This specification does not introduce any new elements or attributes compared
414 to the current DTD. Therefore, all ``metadata.xml`` files created in its
415 compliance will be read correctly by the existing tools and will conform
416 to the current DTD.
417
418 However, this specification is stricter than the rules enforced by the DTD.
419 Therefore, not all existing ``metadata.xml`` files will conform to the spec,
420 even though they would be correct according to the DTD. New tools will
421 consider such files incorrect and ask developers to fix them.
422
423
424 Reference implementation
425 ========================
426
427 Parsing metadata.xml
428 --------------------
429
430 Since the metadata.xml format provided by this specification is compatible
431 with existing tools, no new implementation is required for reading those files.
432
433 Checking metadata.xml validity
434 ------------------------------
435
436 To provide stricter checking of metadata.xml files, an XML schema file is
437 provided in the Gentoo xml-schema repository [#XML-SCHEMA]_. This schema
438 provides:
439
440 - element structure checks,
441
442 - data duplication checks (e.g. multiple descriptions for the same flag;
443 but see below),
444
445 - partial value correctness checks.
446
447 The limitations of the schema are:
448
449 - values are verified using simple regular expressions, so not all format
450 violations will be caught (e.g. the rule will consider ``app-foo/bar-1``
451 a valid qualified package name even though the version suffix is disallowed),
452
453 - cross-references cannot be checked (package references, category
454 references, URLs, project identifiers),
455
456 - ``<maintainer type=""/>`` correctness can not be checked,
457
458 - data duplication checks are done per ``restrict=""`` value rather than
459 per every package version matched by the restriction. Therefore, multiple
460 definitions that are applied to a single package by two different
461 ``restrict=""`` rules will not be caught.
462
463 Example metadata.xml file
464 -------------------------
465
.. code:: xml

    <?xml version='1.0' encoding='UTF-8'?>
    <pkgmetadata>
      <maintainer type='person'>
        <email>developer@×××××××.com</email>
        <name>Example Developer</name>
      </maintainer>
      <maintainer type='person' restrict='dev-libs/foo:11'>
        <email>anotherdev@×××××××.com</email>
        <name>Another Developer</name>
        <description>CC only on bugs for libfoo.so.11</description>
      </maintainer>
      <maintainer type='project'>
        <email>project@×××××××.com</email>
        <name>Example Project</name>
      </maintainer>
      <maintainer type='person'>
        <email>upstream@×××××××.com</email>
        <name>Upstream Developer</name>
        <description>Upstream developer, wishing to be CC-ed on bugs</description>
      </maintainer>
      <longdescription>
        First paragraph of extensive description.

        Second paragraph.
      </longdescription>
      <longdescription lang='de'>
        Erster Absatz mit detaillierter Beschreibung.

        Zweiter Absatz.
      </longdescription>
      <slots>
        <slot name='11'>Compatibility slot providing libfoo.so.11 only.</slot>
        <subslots>
          Match SONAME of libfoo.so.
        </subslots>
      </slots>
      <slots lang='de'>
        <slot name='11'>Kompatibilitäts-Slot, installiert ausschließlich libfoo.so.11.</slot>
        <subslots>
          Subslot ist stets identisch mit dem SONAME von libfoo.so.
        </subslots>
      </slots>
      <use>
        <flag name='foo'>Enables foo feature</flag>
        <flag name='bar' restrict='&lt;dev-libs/foo-12'>Enables bar feature (requires <pkg>dev-libs/bar</pkg>)</flag>
        <flag name='bar' restrict='&gt;=dev-libs/foo-12'>Enables bar feature</flag>
      </use>
      <use lang='de'>
        <flag name='foo'>Konfiguriert das Paket mit Unterstützung für foo</flag>
        <flag name='bar' restrict='&lt;dev-libs/foo-12'>Konfiguriert das Paket mit Unterstützung für bar (benötigt <pkg>dev-libs/bar</pkg>)</flag>
        <flag name='bar' restrict='&gt;=dev-libs/foo-12'>Konfiguriert das Paket mit Unterstützung für bar</flag>
      </use>
      <upstream>
        <maintainer status='active'>
          <email>upstream@×××××××.com</email>
          <name>Upstream Developer</name>
        </maintainer>
        <maintainer status='inactive'>
          <!-- e-mail unknown -->
          <name>John Smith</name>
        </maintainer>
        <changelog>http://www.example.com/releases.html</changelog>
        <doc>http://www.example.com/doc.html</doc>
        <doc lang='de'>http://www.example.com/doc.de.html</doc>
        <bugs-to>http://www.example.com/issues.html</bugs-to>
        <remote-id type='foohub'>example/foo</remote-id>
      </upstream>
    </pkgmetadata>
536
537 German translations provided by tamiko.
538
539
540 References
541 ==========
542
543 .. [#PMS-A] PMS Appendix A
544 https://projects.gentoo.org/pms/5/pms.html#x1-163000A
545
546 .. [#METADATA-DTD] The original metadata.dtd file
547 https://gitweb.gentoo.org/data/dtd.git/tree/metadata.dtd?id=a908a93b5afe295359e0a01814c9bef8b5268bcd
548
549 .. [#XML10] Extensible Markup Language (XML) 1.0 (Fifth Edition)
550 https://www.w3.org/TR/xml/
551
552 .. [#BCP-47] BCP 47: "Tags for identifying languages",
553 https://tools.ietf.org/rfc/bcp/bcp47.txt
554
555 .. [#ORIGINAL-METADATA-XML] The original metadata.xml proposal:
556 Paul de Vrieze. "IMPORTANT: The proposal for the metadata.xml file".
557 gentoo-dev mailing list, 2003-06-27,
558 Message-ID 200306272248.38169.pauldv\@gentoo.org,
559 https://archives.gentoo.org/gentoo-dev/message/cbcc15e9906c0165976ad66d4343ba7a
560
561 .. [#USE-FLAG-METADATA] Doug Goldstein: USE flag metadata
562 https://cardoe.wordpress.com/2007/11/19/use-flag-metadata/
563
564 .. [#XML-SCHEMA] Gentoo XML schema
565 https://gitweb.gentoo.org/data/xml-schema.git/
566
567
568 Copyright
569 =========
570
571 This work is licensed under the Creative Commons Attribution-ShareAlike 4.0
572 International License. To view a copy of this license, visit
573 https://creativecommons.org/licenses/by-sa/4.0/.
