Gentoo Archives: gentoo-dev

From: "Michał Górny" <mgorny@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] [GLEP78][re-post] Updating specification r3
Date: Thu, 14 Jul 2022 10:10:30
Message-Id: 0090359357a0172ffacdd83e676f6f0147c72182.camel@gentoo.org
In Reply to: [gentoo-dev] [GLEP78][re-post] Updating specification r3 by Sheng Yu
1 I am truly sorry for taking this long to reply.
2
3 Overall, this is amazing work. Big +1 from me. I have just a few
4 editorial suggestions; I'm noting them here for completeness, and
5 I'll apply them myself in a minute.
6
7
8 On Sat, 2022-05-28 at 19:17 +0000, Sheng Yu wrote:
9 > From ee52f60557d72d6274610d461eec1d28453a464f Mon Sep 17 00:00:00 2001
10 > From: Sheng Yu <syu.os@××××××××××.com>
11 > Date: Sat, 28 May 2022 15:06:46 -0400
12 > Subject: [PATCH] GLEP 78 draft update
13 >
14 > Signed-off-by: Sheng Yu <syu.os@××××××××××.com>
15 > ---
16 > glep-0078.rst | 114 ++++++++++++++++++++++++++++++++++++++++++--------
17 > 1 file changed, 96 insertions(+), 18 deletions(-)
18 >
19 > diff --git a/glep-0078.rst b/glep-0078.rst
20 > index 1f7cd9b..82c74c8 100644
21 > --- a/glep-0078.rst
22 > +++ b/glep-0078.rst
23 > @@ -2,12 +2,13 @@
24 > GLEP: 78
25 > Title: Gentoo binary package container format
26 > Author: Michał Górny <mgorny@g.o>
27 > + Sheng Yu <syu.os@××××××××××.com>
28 > Type: Standards Track
29 > Status: Draft
30 > Version: 1
31 > Created: 2018-11-15
32 > -Last-Modified: 2019-07-29
33 > -Post-History: 2018-11-17, 2019-07-08
34 > +Last-Modified: 2021-10-10
35 > +Post-History: 2018-11-17, 2019-07-08, 2021-09-13, 2021-09-22, 2022-05-28
36 > Content-Type: text/x-rst
37 > ---
38 >
39 > @@ -154,10 +155,15 @@ The following obligatory goals have been set for a replacement format:
40 > enough to let user inspect and manipulate it without special tooling
41 > or detailed knowledge.
42 >
43 > -3. **The file format must provide support for OpenPGP signatures.**
44 > +3. **The file format must be able to detect its own data corruption.**
45 > + In particular, it needs to contain the checksum of its own data for
46 > + package manager to be able to verify its integrity without relying
47 > + on additional files.
48 > +
49 > +4. **The file format must provide support for OpenPGP signatures.**
50 > Preferably, it should use standard OpenPGP message formats.
51 >
52 > -4. **The file format must allow for efficient metadata updates.**
53 > +5. **The file format must allow for efficient metadata updates.**
54 > In particular, it should be possible to update the metadata without
55 > having to recompress package files.
56 >
57 > @@ -186,35 +192,39 @@ The container format
58 > The gpkg package container is an uncompressed .tar achive whose filename
59 > should use ``.gpkg.tar`` suffix.
60 >
61 > -The archive contains a number of files, stored in a single directory
62 > -whose name should match the basename of the package file. However,
63 > -the implementation must be able to process an archive where
64 > -the directory name is mismatched. There should be no explicit archive
65 > -member entry for the directory.
66 > +The archive contains a number of files. All package-related files
67 > +should be stored in a single directory whose name matches the basename
68 > +of the package file. However, the implementation must be able to
69 > +process an archive where the directory name is mismatched. There should
70 > +be no explicit archive member entry for the directory.
71 >
72 > The package directory contains the following members, in order:
73 >
74 > 1. The package format identifier file ``gpkg-1`` (required).
75 >
76 > -2. A signature for the metadata archive: ``metadata.tar${comp}.sig``
77 > +2. The metadata archive ``metadata.tar${comp}``, optionally compressed
78 > + (required).
79 > +
80 > +3. A signature for the metadata archive: ``metadata.tar${comp}.sig``
81 > (optional).
82 >
83 > -3. The metadata archive ``metadata.tar${comp}``, optionally compressed
84 > - (required).
85 > +4. The filesystem image archive ``image.tar${comp}``, optionally
86 > + compressed (required).
87 >
88 > -4. A signature for the filesystem image archive:
89 > +5. A signature for the filesystem image archive:
90 > ``image.tar${comp}.sig`` (optional).
91 >
92 > -5. The filesystem image archive ``image.tar${comp}``, optionally
93 > - compressed (required).
94 > +6. The package Manifest data file ``Manifest``, optionally clear-text
95 > + signed (required)
96
97 Editorial: full stop is missing here.
98
99 >
100 > It is recommended that relative order of the archive members is
101 > preserved. However, implementations must support archives with members
102 > out of order.
103 >
104 > The container may be extended with additional members in the future.
105 > -The implementations should ignore unrecognized members and preserve
106 > -them across package updates.
107 > +If the Manifest is present, all files contained in the archive must
108 > +be listed in it and verify successfully. The package manager should
109 > +ignore unknown files but preserve them across package updates.
110 >
111 >
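
A minimal sketch of how an implementation might locate the required members listed above while tolerating a mismatched directory name and out-of-order members. It uses Python's tarfile module; the helper names (classify_members, missing_members, build_demo_gpkg) and the demo package name are made up for illustration and are not part of the GLEP. It treats the Manifest as required, as the numbered list does.

```python
import io
import tarfile

# Member kinds marked "(required)" in the numbered list of the draft.
REQUIRED = ("gpkg-1", "metadata.tar", "image.tar", "Manifest")


def classify_members(names):
    """Map each required member kind to the archive member providing it."""
    found = {}
    for name in names:
        # Only the basename is compared: the directory name may be mismatched.
        base = name.rsplit("/", 1)[-1]
        for kind in REQUIRED:
            # metadata.tar${comp} and image.tar${comp} may carry a compression
            # suffix; the matching .sig files are signatures, not the archives.
            compressed = (
                kind.endswith(".tar")
                and base.startswith(kind + ".")
                and not base.endswith(".sig")
            )
            if base == kind or compressed:
                found.setdefault(kind, name)
    return found


def missing_members(names):
    """Return the required member kinds absent from the given member names."""
    found = classify_members(names)
    return [kind for kind in REQUIRED if kind not in found]


def build_demo_gpkg(dirname="foo-1.0"):
    """Build a tiny in-memory container with empty members (demo only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        # Deliberately out of order: implementations must cope with that.
        for member in ("Manifest", "image.tar.xz", "gpkg-1", "metadata.tar.xz"):
            info = tarfile.TarInfo(f"{dirname}/{member}")
            info.size = 0
            tar.addfile(info, io.BytesIO(b""))
    buf.seek(0)
    return buf


# mode="r:" accepts only an uncompressed outer tar, as the format requires.
with tarfile.open(fileobj=build_demo_gpkg(), mode="r:") as tar:
    print(missing_members(tar.getnames()))
```

The sketch only reads member names; it neither extracts anything nor checks the contents of the members.
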
112 > Permitted .tar format features
113 > @@ -301,10 +311,29 @@ suffixed using the standard suffix for the particular compressed file
114 > type (e.g. ``.bz2`` for bzip2 format).
115 >
116 >
117 > +The package Manifest file
118 > +-------------------------
119 > +
120 > +The Manifest file must include digests of all files in the binary
121 > +package container, except for itself. The purpose of this file is
122 > +to provide the package manager with an ability to detect corruption
123 > +or alteration of the binary package before attempting to read the
124 > +inner archive contents. This file also provides protection against
125 > +signature reuse/replacement attacks if the OpenPGP signatures are used.
126 > +
127 > +The implementation follows the Manifest specifications in GLEP 74
128 > +[#GLEP74]_ and uses the DATA tag for files within the container.
129 > +
130 > +The implementation should be able to detect checksum mismatches,
131 > +as well as missing, duplicate, or extraneous files within the
132
133 Editorial: don't leave 'the' at the end of the line.
134
135 > +container. In the case of verification failure, no subsequent
136 > +operations on the archive should be performed.
137 > +
138 > +
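
A rough sketch of the checks this section asks for (checksum mismatches plus missing, duplicate, or extraneous files), assuming GLEP 74 style lines of the form "DATA <path> <size> <hash-name> <hash-value>...". The function names, the in-memory {path: bytes} input, and the choice of supported hashes are assumptions made for the example. OpenPGP clear-text signature handling is left out entirely.

```python
import hashlib

# Assumed mapping from Manifest hash names to hashlib constructors.
HASHES = {"BLAKE2B": hashlib.blake2b, "SHA512": hashlib.sha512}


def parse_manifest(text):
    """Return {path: (size, {hash_name: hexdigest})} for all DATA lines."""
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] != "DATA":
            continue  # this sketch ignores every other line
        path, size, rest = fields[1], int(fields[2]), fields[3:]
        if path in entries:
            raise ValueError(f"duplicate Manifest entry: {path}")
        # The remaining fields alternate between hash name and hash value.
        entries[path] = (size, dict(zip(rest[0::2], rest[1::2])))
    return entries


def verify(manifest_text, files):
    """Check files ({path: bytes}, Manifest excluded); return found problems."""
    entries = parse_manifest(manifest_text)
    problems = []
    for path in sorted(entries.keys() - files.keys()):
        problems.append(f"missing: {path}")
    for path in sorted(files.keys() - entries.keys()):
        problems.append(f"extraneous: {path}")
    for path in sorted(entries.keys() & files.keys()):
        size, digests = entries[path]
        data = files[path]
        if len(data) != size:
            problems.append(f"size mismatch: {path}")
            continue
        for name, expected in digests.items():
            if name not in HASHES:
                problems.append(f"unsupported hash {name}: {path}")
            elif HASHES[name](data).hexdigest() != expected:
                problems.append(f"{name} mismatch: {path}")
    return problems


# Demo with made-up member contents.
demo_files = {"gpkg-1": b"", "image.tar": b"payload"}
demo_manifest = "\n".join(
    f"DATA {path} {len(data)} SHA512 {hashlib.sha512(data).hexdigest()}"
    for path, data in demo_files.items()
)
print(verify(demo_manifest, demo_files))
```

Per the paragraph above, a caller would stop processing the archive as soon as verify() reports any problem.
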
139 > OpenPGP member signatures
140 > -------------------------
141 >
142 > -The archive members support optional OpenPGP signatures.
143 > +The archive members and Manifest support optional OpenPGP signatures.
144 > The implementations must allow the user to specify whether OpenPGP
145 > signatures are to be expected in remotely fetched packages.
146 >
147 > @@ -490,6 +519,38 @@ Debian has a similar guideline for the inner tar of their package
148 > format [#DEB-FORMAT]_.
149 >
150 >
151 > +.tar security issues
152 > +--------------------
153 > +
154 > +Some of the original features of .tar are obsolete with the modern
155 > +usage.
156 > +
157 > +Firstly, .tar permits duplicate files to exist [#TARDUP]_. The
158
159 Same: don't leave 'The' at the end of the line.
160
161 > +later duplicate files overwrite the previously extracted files when
162 > +extracting all files in order. This is useful for incremental
163 > +backups. However, general-purpose archiving tools may choose
164 > +arbitrary files matching a path name, leading to checksum or
165 > +signature bypass. To prevent this, duplicate files are forbidden
166 > +from existing.
167 > +
168 > +Secondly, .tar lacks integrity checks, except for the header
169 > +self-check. Data corruption can usually be detected through
170 > +integrity checks in the additional compression layer. However,
171 > +this does not provide a way of verifying the integrity of the
172
173 Here too.
174
175 > +compressed data in advance. For this reason, an additional
176 > +Manifest file is included that provides checksums for other
177 > +files in the archive. A corrupted Manifest invalidates the whole
178 > +package.
179 > +
180 > +Thirdly, many .tar implementations have various security problems,
181 > +including the Python tarfile module [#ISSUE21109]_. They provide
182 > +multiple attack vectors, e.g. permitting overwriting files outside the
183 > +destination directory using special filenames, symlinks, hard links or
184
185 Here, both 'the' and 'or' are left at the end of a line.
186
187 > +device files. For this purpose, only regular files are permitted inside
188 > +the container. It is recommended to process the container data in place
189 > +rather than extracting it.
190 > +
191 > +
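
To illustrate the first and third points, here is a sketch that rejects duplicate and non-regular members by reading only the tar headers, without extracting anything. It relies on Python's tarfile module; make_tar and check_container are hypothetical helper names, and the demo archive contents are invented.

```python
import io
import tarfile
from collections import Counter


def make_tar(entries):
    """Build an in-memory uncompressed tar from (name, type, payload) tuples."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, kind, payload in entries:
            info = tarfile.TarInfo(name)
            info.type = kind
            if kind == tarfile.SYMTYPE:
                # For symlinks the payload is used as the link target.
                info.linkname = payload.decode()
                tar.addfile(info)
            else:
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


def check_container(fileobj):
    """Return a list of problems: duplicate names and non-regular members."""
    with tarfile.open(fileobj=fileobj, mode="r:") as tar:
        # getmembers() lists every header in order, duplicates included.
        members = tar.getmembers()
    problems = []
    counts = Counter(member.name for member in members)
    for name, count in sorted(counts.items()):
        if count > 1:
            problems.append(f"duplicate member: {name}")
    for member in members:
        if not member.isreg():
            problems.append(f"non-regular member: {member.name}")
    return problems


# Demo: one duplicated member and one symlink pointing outside the archive.
bad = make_tar([
    ("foo-1.0/gpkg-1", tarfile.REGTYPE, b""),
    ("foo-1.0/gpkg-1", tarfile.REGTYPE, b"x"),
    ("foo-1.0/evil", tarfile.SYMTYPE, b"/etc/passwd"),
])
for problem in check_container(bad):
    print(problem)
```

Because nothing is written to disk, the sketch stays in line with the recommendation to process the container data in place.
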
192 > Member ordering
193 > ---------------
194 >
195 > @@ -511,6 +572,14 @@ them. Covering the compressed archives helps to prevent zipbomb
196 > attacks. Covering the individual members rather than the whole package
197 > provides for verification of partially fetched binary packages.
198 >
199 > +However, signing individual files does not guarantee that all members
200 > +are originating from the same binary package. This opens up the
201
202 Here too.
203
204 > +possibility of a replacement/reuse attack, e.g. combining the signed
205 > +metadata from foo-1.1 with signed image from foo-1.0. The new binary
206 > +package passes the signature check. To prevent this type of attack,
207 > +we need the additional Manifest file and its signature to verify the
208
209 ...and here.
210
211 > +authenticity of the complete binary package.
212 > +
213 >
214 > Format versioning
215 > -----------------
216 > @@ -564,10 +633,19 @@ References
217 > .. [#TAR-PORTABILITY] Michał Górny, Portability of tar features
218 > (https://dev.gentoo.org/~mgorny/articles/portability-of-tar-features.html)
219 >
220 > +.. [#GLEP74] GLEP 74: Full-tree verification using Manifest files
221 > + (https://www.gentoo.org/glep/glep-0074.html)
222 > +
223 > .. [#XPAK2GPKG] xpak2gpkg: Proof-of-concept converter from tbz2/xpak
224 > to gpkg binpkg format
225 > (https://github.com/mgorny/xpak2gpkg)
226 >
227 > +.. [#TARDUP] tar: Multiple Members with the Same Name
228 > + (https://www.gnu.org/software/tar/manual/html_node/multiple.html)
229 > +
230 > +.. [#ISSUE21109] Python tarfile: Traversal attack vulnerability
231 > + (https://bugs.python.org/issue21109)
232 > +
233 >
234 > Copyright
235 > =========
236 > --
237 > 2.35.1
238
239 --
240 Best regards,
241 Michał Górny