Gentoo Archives: gentoo-dev

From: "Michał Górny" <mgorny@g.o>
To: gentoo-dev@l.g.o
Cc: "Michał Górny" <mgorny@g.o>
Subject: [gentoo-dev] [PATCH v3 3/3] glep-0074: Specify compressed file formats
Date: Sun, 18 Sep 2022 18:32:46
Message-Id: 20220918183139.1534979-4-mgorny@gentoo.org
In Reply to: [gentoo-dev] [PATCH v3 0/3] glep-0074: Explicitly specify hashes and compressed Manifest formats by "Michał Górny"
Signed-off-by: Michał Górny <mgorny@g.o>
---
 glep-0074.rst | 81 ++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 71 insertions(+), 10 deletions(-)

diff --git a/glep-0074.rst b/glep-0074.rst
index 5a63f70..d96a3dd 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -27,7 +27,8 @@ Changes
 =======

 v1.3
-  Formally specified the current set of hash algorithms supported.
+  Formally specified the current set of hash algorithms and compressed
+  Manifest formats supported.

 v1.2
   Specified the newline convention used for Manifests.
@@ -432,9 +433,8 @@ compression and this specification.

 The compressed Manifest files are required to be suffixed for their
 compression algorithm. This suffix should be used to recognize
-the compression and decompress Manifests transparently. The exact list
-of algorithms and their corresponding suffixes are outside the scope
-of this specification.
+the compression and decompress Manifests transparently. The supported
+formats are specified in `compressed file formats`_ section.

 The top-level Manifest file must not be compressed. Since the OpenPGP
 signature covers the uncompressed text and is compressed itself,
@@ -455,6 +455,46 @@ uncompressed content and the specification is free to choose either
 of the files using the same base name.


+Compressed file formats
+-----------------------
+
+.. table:: Table 2. Defined compressed file formats
+  :widths: auto
+
+  =========== ====== ==================== ===========
+  Tool name   Suffix Specification        Notes
+  =========== ====== ==================== ===========
+  bzip2       .bz2   (none known)
+  gzip        .gz    RFC 1952 [#RFC1952]_ Recommended
+  lz4         .lz4   (none known)
+  lzip        .lz    RFC draft [#LZIP]_
+  lzma        .lzma  (none known)         Deprecated
+  lzop        .lzo   (none known)
+  xz          .xz    xz [#XZ]_
+  zstd        .zst   RFC 8878 [#RFC8878]_
+  =========== ====== ==================== ===========
+
+Any new formats must be added to this specification prior to being used
+for Manifest files. Adding a new compressed file format is considered
+a backwards-compatible change to the GLEP. It is recommended that new
+formats use their reference (most common) file suffixes.
+
+An implementation can implement an arbitrary subset of the listed
+formats. For best interoperability, it should implement at least
+the recommended formats. Using deprecated formats should be avoided.
+
+If multiple Manifest variants coexist using different compressed file
+formats, the implementation may choose to use an arbitrary subset
+of them. However, all of them must be verified against the hashes stored
+in the containing Manifest. Should they be decompressed, the resulting
+contents must be identical.
+
+If the compressed file format is unsupported and a variant using
+a supported format coexists, the other variant should be used. However,
+at least one supported variant must exist for the verification
+to succeed.
+
+
 Combining multiple Manifest trees (informational)
 -------------------------------------------------

@@ -1033,12 +1073,19 @@ into a compressed sub-Manifest in the top directory (e.g.
 ``Manifest.sub.gz``), and including a ``MANIFEST`` entry for this file
 in a signed, uncompressed top-level Manifest.

-The existence of additional entries for uncompressed Manifest checksums
-was debated. However, plain entries for the uncompressed file would
-be confusing if only the compressed file existed, and conflicting
-if both uncompressed and compressed variants existed. Furthermore,
-it has been pointed out that ``DIST`` entries do not have
-an uncompressed variant either.
+The existence of additional entries for checksums of Manifest contents
+after uncompressing was debated. However, plain entries for
+the uncompressed file would be confusing if only the compressed file
+existed. Furthermore, it has been pointed out that ``DIST`` entries
+do not have an uncompressed variant either.
+
+The specification permits coexistence of multiple variants of the same
+Manifest file using different compression for historical compatibility.
+However, there does not seem to be any real benefit from including
+a compressed Manifest file if the uncompressed variant needs to exist
+anyway. Providing different compressed variants could technically
+improve interoperability, though the same result could probably
+be achieved by using a more commonly supported format (e.g. gzip).


 Performance considerations
@@ -1171,6 +1218,20 @@ References
    (archived at 2017-11-29)
    (https://web.archive.org/web/20171129084214/http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html)

+.. [#RFC1952] RFC 1952: GZIP file format specification version 4.3
+   (https://www.rfc-editor.org/rfc/rfc1952)
+
+.. [#LZIP] RFC draft: Lzip Compressed Format and the 'application/lzip'
+   Media Type
+   (https://datatracker.ietf.org/doc/html/draft-diaz-lzip)
+
+.. [#XZ] The .xz File Format
+   (https://tukaani.org/xz/xz-file-format.txt)
+
+.. [#RFC8878] RFC 8878: Zstandard Compression and the 'application/zstd'
+   Media Type
+   (https://www.rfc-editor.org/rfc/rfc8878)
+
 .. [#C08] Cappos, J et al. (2008). "Attacks on Package Managers"
    (https://www2.cs.arizona.edu/stork/packagemanagersecurity/attacks-on-package-managers.html)

--
2.37.3
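
Illustrative note (not part of the patch): the variant-handling rules
added in the "Compressed file formats" section could be sketched in
Python roughly as follows. This is only a sketch under stated
assumptions, not how any existing tool implements the GLEP. The names
`DECOMPRESSORS`, `KNOWN_UNSUPPORTED`, `split_suffix` and `load_variants`
are hypothetical, SHA512 is assumed as the only hash recorded in the
containing Manifest, and only the formats readable with the Python
standard library are treated as supported (the patch permits
implementing an arbitrary subset).

```python
import bz2
import gzip
import hashlib
import lzma

# Suffix -> decompressor, limited to what the standard library can read.
# lzma.decompress() auto-detects both the .xz and the legacy .lzma container.
DECOMPRESSORS = {
    ".gz": gzip.decompress,
    ".bz2": bz2.decompress,
    ".xz": lzma.decompress,
    ".lzma": lzma.decompress,
}
# Suffixes listed in Table 2 that this sketch does not implement.
KNOWN_UNSUPPORTED = (".lz4", ".lz", ".lzo", ".zst")


def split_suffix(name):
    """Return (base name, suffix); the suffix is '' for an uncompressed file."""
    for suffix in (*DECOMPRESSORS, *KNOWN_UNSUPPORTED):
        if name.endswith(suffix):
            return name[: -len(suffix)], suffix
    return name, ""


def load_variants(variants, expected_sha512):
    """Verify all coexisting variants of one Manifest and return its content.

    variants: {file name: raw bytes as stored on disk}
    expected_sha512: {file name: hex digest from the containing Manifest}
    """
    contents = []
    for name, raw in variants.items():
        # Every variant must match its recorded hash, supported or not.
        if hashlib.sha512(raw).hexdigest() != expected_sha512[name]:
            raise ValueError(f"checksum mismatch: {name}")
        _, suffix = split_suffix(name)
        if suffix == "":
            contents.append(raw)
        elif suffix in DECOMPRESSORS:
            contents.append(DECOMPRESSORS[suffix](raw))
        # Unsupported formats are verified above but not decompressed.
    if not contents:
        raise ValueError("no variant in a supported format")
    # All decompressed variants must yield identical contents.
    if any(content != contents[0] for content in contents[1:]):
        raise ValueError("variants decompress to different contents")
    return contents[0]
```

Decompressing every supported variant is a choice made here to keep the
identity check simple; the text equally allows using only one of them
after all variants have been verified against their hashes.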