I am truly sorry for taking this long to reply.

Overall, this is amazing work. Big +1 from me. I have just a few
editorial suggestions. I'm noting them here for completeness; I'll
apply them myself in a minute.

On Sat, 2022-05-28 at 19:17 +0000, Sheng Yu wrote:
> From ee52f60557d72d6274610d461eec1d28453a464f Mon Sep 17 00:00:00 2001
> From: Sheng Yu <syu.os@××××××××××.com>
> Date: Sat, 28 May 2022 15:06:46 -0400
> Subject: [PATCH] GLEP 78 draft update
>
> Signed-off-by: Sheng Yu <syu.os@××××××××××.com>
> ---
> glep-0078.rst | 114 ++++++++++++++++++++++++++++++++++++++++++-------
> 1 file changed, 96 insertions(+), 18 deletions(-)
>
> diff --git a/glep-0078.rst b/glep-0078.rst
> index 1f7cd9b..82c74c8 100644
> --- a/glep-0078.rst
> +++ b/glep-0078.rst
> @@ -2,12 +2,13 @@
> GLEP: 78
> Title: Gentoo binary package container format
> Author: Michał Górny <mgorny@g.o>
> + Sheng Yu <syu.os@××××××××××.com>
> Type: Standards Track
> Status: Draft
> Version: 1
> Created: 2018-11-15
> -Last-Modified: 2019-07-29
> -Post-History: 2018-11-17, 2019-07-08
> +Last-Modified: 2021-10-10
> +Post-History: 2018-11-17, 2019-07-08, 2021-09-13, 2021-09-22, 2022-05-28
> Content-Type: text/x-rst
> ---
>
> @@ -154,10 +155,15 @@ The following obligatory goals have been set for a replacement format:
> enough to let user inspect and manipulate it without special tooling
> or detailed knowledge.
>
> -3. **The file format must provide support for OpenPGP signatures.**
> +3. **The file format must be able to detect its own data corruption.**
> + In particular, it needs to contain the checksum of its own data for
> + package manager to be able to verify its integrity without relying
> + on additional files.
> +
> +4. **The file format must provide support for OpenPGP signatures.**
> Preferably, it should use standard OpenPGP message formats.
>
> -4. **The file format must allow for efficient metadata updates.**
> +5. **The file format must allow for efficient metadata updates.**
> In particular, it should be possible to update the metadata without
> having to recompress package files.
>
> @@ -186,35 +192,39 @@ The container format
> The gpkg package container is an uncompressed .tar achive whose filename
> should use ``.gpkg.tar`` suffix.
>
> -The archive contains a number of files, stored in a single directory
> -whose name should match the basename of the package file. However,
> -the implementation must be able to process an archive where
> -the directory name is mismatched. There should be no explicit archive
> -member entry for the directory.
> +The archive contains a number of files. All package-related files
> +should be stored in a single directory whose name matches the basename
> +of the package file. However, the implementation must be able to
> +process an archive where the directory name is mismatched. There should
> +be no explicit archive member entry for the directory.
>
> The package directory contains the following members, in order:
>
> 1. The package format identifier file ``gpkg-1`` (required).
>
> -2. A signature for the metadata archive: ``metadata.tar${comp}.sig``
> +2. The metadata archive ``metadata.tar${comp}``, optionally compressed
> + (required).
> +
> +3. A signature for the metadata archive: ``metadata.tar${comp}.sig``
> (optional).
>
> -3. The metadata archive ``metadata.tar${comp}``, optionally compressed
> - (required).
> +4. The filesystem image archive ``image.tar${comp}``, optionally
> + compressed (required).
>
> -4. A signature for the filesystem image archive:
> +5. A signature for the filesystem image archive:
> ``image.tar${comp}.sig`` (optional).
>
> -5. The filesystem image archive ``image.tar${comp}``, optionally
> - compressed (required).
> +6. The package Manifest data file ``Manifest``, optionally clear-text
> + signed (required)

Editorial: full stop is missing here.

>
> It is recommended that relative order of the archive members is
> preserved. However, implementations must support archives with members
> out of order.
>
> The container may be extended with additional members in the future.
> -The implementations should ignore unrecognized members and preserve
> -them across package updates.
> +If the Manifest is present, all files contained in the archive must
> +be listed in it and verify successfully. The package manager should
> +ignore unknown files but preserve them across package updates.
>
>
> Permitted .tar format features
> @@ -301,10 +311,29 @@ suffixed using the standard suffix for the particular compressed file
> type (e.g. ``.bz2`` for bzip2 format).
>
>
> +The package Manifest file
> +-------------------------
> +
> +The Manifest file must include digests of all files in the binary
> +package container, except for itself. The purpose of this file is
> +to provide the package manager with an ability to detect corruption
> +or alteration of the binary package before attempting to read the
> +inner archive contents. This file also provides protection against
> +signature reuse/replacement attacks if the OpenPGP signatures are used.
> +
> +The implementation follows the Manifest specifications in GLEP 74
> +[#GLEP74]_ and uses the DATA tag for files within the container.
> +
> +The implementation should be able to detect checksum mismatches,
> +as well as missing, duplicate, or extraneous files within the

Editorial: don't leave 'the' at the end of the line.

> +container. In the case of verification failure, no subsequent
> +operations on the archive should be performed.
> +
> +
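
Not a change request, just an illustration: the checks described in this
section could look roughly like the sketch below. It assumes GLEP 74
style lines of the form ``DATA <path> <size> <HASH> <hex> ...``; the
function names, the hash-name mapping, and the choice to pass members as
an in-memory dict are my own assumptions and are not taken from the
patch or from Portage. OpenPGP verification of a signed Manifest is out
of scope here.

```python
import hashlib

# Assumed mapping from Manifest hash names to hashlib algorithm names.
HASHES = {"BLAKE2B": "blake2b", "SHA512": "sha512"}


def parse_manifest(text):
    """Parse 'DATA <path> <size> <HASH> <hex> ...' lines into a dict."""
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] != "DATA":
            continue  # other tags and OpenPGP armor are ignored in this sketch
        path, size, rest = fields[1], int(fields[2]), fields[3:]
        if path in entries:
            raise ValueError(f"duplicate Manifest entry: {path}")
        entries[path] = (size, dict(zip(rest[::2], rest[1::2])))
    return entries


def verify_members(manifest_text, members):
    """Return a list of problems; an empty list means verification passed.

    `members` maps member path to raw bytes, with the Manifest excluded.
    """
    entries = parse_manifest(manifest_text)
    problems = []
    for path in sorted(entries.keys() - members.keys()):
        problems.append(f"missing: {path}")
    for path in sorted(members.keys() - entries.keys()):
        problems.append(f"extraneous: {path}")
    for path in sorted(entries.keys() & members.keys()):
        size, digests = entries[path]
        data = members[path]
        if len(data) != size:
            problems.append(f"size mismatch: {path}")
            continue
        known = [name for name in digests if name in HASHES]
        if not known:
            problems.append(f"no supported hash: {path}")
        for name in known:
            if hashlib.new(HASHES[name], data).hexdigest() != digests[name]:
                problems.append(f"{name} mismatch: {path}")
    return problems
```

A caller would refuse to touch the archive any further unless the
returned list is empty, which matches the "no subsequent operations"
wording above.
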
> OpenPGP member signatures
> -------------------------
>
> -The archive members support optional OpenPGP signatures.
> +The archive members and Manifest support optional OpenPGP signatures.
> The implementations must allow the user to specify whether OpenPGP
> signatures are to be expected in remotely fetched packages.
>
> @@ -490,6 +519,38 @@ Debian has a similar guideline for the inner tar of their package
> format [#DEB-FORMAT]_.
>
>
> +.tar security issues
> +--------------------
> +
> +Some of the original features of .tar are obsolete with the modern
> +usage.
> +
> +Firstly, .tar permits duplicate files to exist [#TARDUP]_. The

Editorial: same as before, don't leave 'The' at the end of the line.

> +later duplicate files overwrite the previously extracted files when
> +extracting all files in order. This is useful for incremental
> +backups. However, a general-purpose archiving tools may choose
> +arbitrary files matching a path name, leading to checksum or
> +signature bypass. To prevent this, duplicate files are forbidden
> +from existing.
> +
> +Secondly, .tar lacks integrity checks, except for the header
> +self-check. Data corruption can usually be detected through
> +integrity checks in the additional compression layer. However,
> +this does not provide a way of verifying the integrity of the

Editorial: here too, 'the' is left at the end of the line.

> +compressed data in advance. For this reason, an additional
> +Manifest file is included that provides checksums for other
> +files in the archive. A corrupted Manifest invalidates the whole
> +package.
> +
> +Thirdly, many .tar implementations have various security problems,
> +including the Python tarfile module [#ISSUE21109]_. They provide
> +multiple attack vectors, e.g. permitting overwriting files outside the
> +destination directory using special filenames, symlinks, hard links or

Editorial: same issue in the two lines above, which end in 'the' and
'or' respectively.

> +device files. For this purpose, only regular files are permitted inside
> +the container. It is recommended to process the container data in place
> +rather than extracting it.
> +
> +
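
As an aside, the two structural rules in this section (no duplicate
members, regular files only) can be checked from the member headers
alone, without extracting anything. Below is a minimal sketch using the
Python tarfile module; the function name ``check_container`` is
hypothetical, and the sketch compares member names literally, so a real
implementation would also want to normalize paths such as ``./foo``
before comparing.

```python
import io
import tarfile


def check_container(fileobj):
    """Return a list of problems found in the tar member headers.

    Only headers are inspected; no member is extracted to disk.
    """
    problems = []
    seen = set()
    # mode "r:" accepts only an uncompressed tar, as the outer container is
    with tarfile.open(fileobj=fileobj, mode="r:") as tar:
        for member in tar:
            if member.name in seen:
                problems.append(f"duplicate member: {member.name}")
            seen.add(member.name)
            if not member.isreg():
                problems.append(f"not a regular file: {member.name}")
    return problems
```

Member data can then be read in place with ``TarFile.extractfile()``
once the list comes back empty, which fits the recommendation to process
the container without extracting it.
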
> Member ordering
> ---------------
>
> @@ -511,6 +572,14 @@ them. Covering the compressed archives helps to prevent zipbomb
> attacks. Covering the individual members rather than the whole package
> provides for verification of partially fetched binary packages.
>
> +However, signing individual files does not guarantee that all members
> +are originating from the same binary package. This opens up the

Editorial: here too, 'the' is left at the end of the line.

> +possibility of a replacement/reuse attack, e.g. combining the signed
> +metadata from foo-1.1 with signed image from foo-1.0. The new binary
> +package passes the signature check. To prevent this type of attack,
> +we need the additional Menifest file and its signature to verify the

Editorial: and here as well ('the' at the end of the line). Also,
'Menifest' should read 'Manifest'.

> +authenticity of the complete binary package.
> +
>
> Format versioning
> -----------------
> @@ -564,10 +633,19 @@ References
> .. [#TAR-PORTABILITY] Michał Górny, Portability of tar features
> (https://dev.gentoo.org/~mgorny/articles/portability-of-tar-features.html)
>
> +.. [#GLEP74] GLEP 74: Full-tree verification using Manifest files
> + (https://www.gentoo.org/glep/glep-0074.html)
> +
> .. [#XPAK2GPKG] xpak2gpkg: Proof-of-concept converter from tbz2/xpak
> to gpkg binpkg format
> (https://github.com/mgorny/xpak2gpkg)
>
> +.. [#TARDUP] tar: Multiple Members with the Same Name
> + (https://www.gnu.org/software/tar/manual/html_node/multiple.html)
> +
> +.. [#ISSUE21109] Python tarfile: Traversal attack vulnerability
> + (https://bugs.python.org/issue21109)
> +
>
> Copyright
> =========
> --
> 2.35.1

--
Best regards,
Michał Górny