1 |
On Thu, Nov 16, 2017 at 11:19:54AM +0100, Michał Górny wrote: |
2 |
> [snip] |
3 |
> Abstract |
4 |
> ======== |
5 |
> |
6 |
> This GLEP extends the Manifest file format to cover full-tree file |
7 |
> integrity and authenticity checks.The format aims to be future-proof, |
8 |
|
9 |
Missing a space after the first sentence, between "checks." and "The". |
10 |
|
11 |
> efficient and provide means of backwards compatibility. |
12 |
|
13 |
Could use an Oxford comma after "efficient", but it's a style choice. Up |
14 |
to you. |
15 |
|
16 |
> |
17 |
> |
18 |
> Motivation |
19 |
> ========== |
20 |
> |
21 |
> The Manifest files as defined by GLEP 44 [#GLEP44]_ provide the current |
22 |
> means of verifying the integrity of distfiles and package files |
23 |
> in Gentoo. Combined with OpenPGP signatures, they provide means to |
24 |
> ensure the authenticity of the covered files. However, as noted |
25 |
> in GLEP 57 [#GLEP57]_ they lack the ability to provide full-tree |
26 |
> authenticity verification as they do not cover any files outside |
27 |
> the package directory. In particular, they provide multiple ways |
28 |
> for a third party to inject malicious code into the ebuild environment. |
29 |
> |
30 |
> Historically, the topic of providing authenticity coverage for the whole |
31 |
> repository has been mentioned multiple times. The most noteworthy effort |
32 |
> are GLEPs 58 [#GLEP58]_ and 60 [#GLEP60]_ by Robin H. Johnson from 2008. |
33 |
> They were accepted by the Council in 2010 but have never been |
34 |
> implemented. When potential implementation work started in 2017, a new |
35 |
> discussion about the specification arose. It prompted the creation |
36 |
> of a competing GLEP that would provide a redesigned alternative to |
37 |
> the old GLEPs. |
38 |
|
39 |
No correction, but I really like the inclusion of history here. It gives |
40 |
the reader more context, should they have questions about prior |
41 |
discussions. |
42 |
|
43 |
> [snip] |
44 |
> 1. It is more future-proof. If an incompatible change to the repository |
45 |
> format is introduced, only developers need to be upgrade the tools |
46 |
> they use to generate the Manifests. The tools used to verify |
47 |
> the updated Manifests will continue to work. |
48 |
|
49 |
"be upgrade" -> "upgrade" |
50 |
|
51 |
> |
52 |
> [snip] |
53 |
> While both models have their advantages, the hierarchical model was |
54 |
> selected because it reduces the number of OpenPGP operations |
55 |
> which are comparatively costly to the minimum. |
56 |
|
57 |
It seems like "which are comparatively costly" should be in parentheses, |
58 |
or separated by some other punctuation like en- or em-dashes. e.g. |
59 |
|
60 |
"... because it reduces the number of OpenPGP operations (which are |
61 |
comparatively costly) to the minimum." |
62 |
|
63 |
or |
64 |
|
65 |
"... because it reduces the number of OpenPGP operations – which are |
66 |
comparatively costly – to the minimum." |
67 |
|
68 |
(Note en-dash was used (0x2013), not a regular hyphen (0x2D).) |
69 |
|
70 |
Or something like that. |
71 |
|
72 |
> |
73 |
> [snip] |
74 |
> Non-strict Manifest verification |
75 |
> -------------------------------- |
76 |
> |
77 |
> Originally the Manifest2 format provided a special ``MISC`` tag that |
78 |
> was used for ``metadata.xml`` and ``ChangeLog`` files. This tag |
79 |
> indicated that the Manifest verification failures could be ignored for |
80 |
> those files unless the package manager was working in strict mode. |
81 |
> |
82 |
> The first versions of this specification continued the use of this tag. |
83 |
> However, after a long debate it was decided to deprecate it along with |
84 |
> the non-strict behavior, and require all files to strictly match. |
85 |
|
86 |
It may be outside the scope of the GLEP, but a link to said long debate |
87 |
might be relevant to the reader, especially if they have suggestions or |
88 |
points that have already been discussed in the debate. |
89 |
|
90 |
> [snip] |
91 |
> Finally, the non-strict mode could be used as means to an attack. |
92 |
> The allowance of missing or modified documentation file could be used |
93 |
> to spread misinformation, resulting in bad decisions made by the user. |
94 |
> A modified file could also be used e.g. to exploit vulnerabilities |
95 |
> of an XML parser. |
96 |
|
97 |
"used e.g." -> "used, e.g." |
98 |
|
99 |
Helps it reflect the way it would be spoken. |
100 |
|
101 |
> |
102 |
> |
103 |
> Timestamp field |
104 |
> --------------- |
105 |
> |
106 |
> The top-level Manifests optionally allows using a ``TIMESTAMP`` tag |
107 |
> to include a generation timestamp in the Manifest. A similar feature |
108 |
> was originally proposed in GLEP 58 [#GLEP58]_. |
109 |
|
110 |
"Manifests" and "allows" disagree grammatically -- one of them needs to |
111 |
drop the "s". Context clues indicate a singular top-level Manifest. |
112 |
|
113 |
> |
114 |
> A malicious third-party may use the principles of exclusion or replay |
115 |
> [#C08]_ to deny an update to clients, while at the same time recording |
116 |
> the identity of clients to attack. The timestamp field can be used to |
117 |
> detect that. |
118 |
> |
119 |
> In order to provide a more complete protection, the Gentoo |
120 |
> Infrastructure should provide an ability to obtain the timestamps |
121 |
> of all Manifests from a recent timeframe over a secure channel |
122 |
> from a trusted source for comparison. |
123 |
|
124 |
"a more complete protection"; should probably drop the "a". |
125 |
|
126 |
> |
127 |
> Strictly speaking, this information is already provided by the various |
128 |
> ``metadata/timestamp*`` files that are already present. However, |
129 |
> including the value in the Manifest itself has a little cost |
130 |
> and provides the ability to perform the verification stand-alone. |
131 |
> |
132 |
> Furthermore, some of the timestamp files are added very late |
133 |
> in the distribution process, past the Manifest generation phase. Those |
134 |
> files will most likely receive ``IGNORE`` entries and therefore |
135 |
> be not suitable to safe use. |
136 |
|
137 |
Looks like a few extra words in the last sentence. Here's my attempt: |
138 |
|
139 |
"These files will likely receive ``IGNORE`` entries and therefore be |
140 |
unsafe to use." |
141 |
|
142 |
("unsuitable" may replace "unsafe", up to you) |
143 |
|
144 |
> |
145 |
> The specification permits additional timestamps in sub-Manifest files |
146 |
> for local use. A generic testing tool should ignore them. |
147 |
> |
148 |
> |
149 |
> New vs deprecated tags |
150 |
> ---------------------- |
151 |
> |
152 |
> Out of the four types defined by Manifest2, only one is reused |
153 |
> and the remaining three is replaced by a single, universal ``DATA`` |
154 |
> type. |
155 |
|
156 |
"the remaining three is" -> "the remaining three are" |
157 |
|
158 |
> [snip] |
159 |
> Injecting ChangeLogs into the checkout |
160 |
> -------------------------------------- |
161 |
> |
162 |
> One of the problems considered in the new Manifest format was that |
163 |
> of injecting historical and autogenerated ChangeLog into the repository. |
164 |
> Normally we are not including those files to reduce the checkout size. |
165 |
> However, some users have shown interest in them and Infra is working |
166 |
> on providing them via an additional rsync module. |
167 |
|
168 |
"that of" is extraneous here. |
169 |
|
170 |
The second sentence should read something like "We normally don't |
171 |
include these files, to reduce checkout size." |
172 |
|
173 |
> |
174 |
> [snip] |
175 |
> Hash algorithms |
176 |
> --------------- |
177 |
> |
178 |
> While maintaining a consistent supported hash set is important |
179 |
> for interoperability, it is no good fit for the generic layout of this |
180 |
> GLEP. Furthermore, it would require updating the GLEP in the future |
181 |
> every time the used algorithms change. |
182 |
|
183 |
"it is no good fit" -> "it is not a good fit" |
184 |
|
185 |
> |
186 |
> [snip] |
187 |
> The compression of top-level Manifest file has been prohibited |
188 |
> as the specification currently does not provide any means of verifying |
189 |
> the file prior to decompression. This would make it possibly for |
190 |
> a malicious third party to provide a compressed Manifest exposing |
191 |
> decompressor vulnerabilities, or being a zip bomb, and the tooling |
192 |
> would have to unpack it before being able to verify the contents. |
193 |
|
194 |
The latter half of the paragraph is a little scattered. Here's my |
195 |
attempt, after the first sentence: |
196 |
|
197 |
"If the top-level Manifest is compressed, tooling will have to unpack |
198 |
the file before being able to verify the contents. This makes it |
199 |
possible for a malicious third party to attack a system by providing a |
200 |
compressed Manifest that exposes decompressor vulnerabilities, or a zip |
201 |
bomb." |
202 |
|
203 |
(Maybe 'zip bomb' should be a link or a footnote, describing what it |
204 |
is.) |
205 |
|
206 |
> [snip] |
207 |
> |
208 |
> The existence of additional entries for uncompressed Manifest checksums |
209 |
> was debated. However, plain entries for the uncompressed file would |
210 |
> be confusing if only compressed file existed, and conflicting if both |
211 |
|
212 |
"only compressed" -> "only the compressed" |
213 |
|
214 |
> uncompressed and compressed variants existed. Furthermore, it has been |
215 |
> pointed out that ``DIST`` entries do not have uncompressed variant |
216 |
|
217 |
"uncompressed variant" -> "an uncompressed variant" |
218 |
|
219 |
> either. |
220 |
> |
221 |
> |
222 |
> Performance considerations |
223 |
> -------------------------- |
224 |
> |
225 |
> Performing a full-tree verification on every sync raises some |
226 |
> performance concerns for end-user systems. The initial testing has shown |
227 |
> that a cold-cache verification on a btrfs file system can take up around |
228 |
> 4 minutes, with the process being mostly I/O bound. On the other hand, |
229 |
> it can be expected that the verification will be performed directly |
230 |
> after syncing, taking advantage of warm filesystem cache. |
231 |
|
232 |
"warm" -> "a warm" |
233 |
|
234 |
> |
235 |
> [snip] |
236 |
> Thanks to all the people whose contributions were invaluable |
237 |
> to the creation of this GLEP. This includes but is not limited to: |
238 |
> |
239 |
> - Robin Hugh Johnson, |
240 |
> - Ulrich Müller. |
241 |
> |
242 |
> Additionally, thanks to Robin Hugh Johnson for the original |
243 |
> MataManifest GLEP series which served both as inspiration and source |
244 |
|
245 |
"MataManifest" -> "MetaManifest" |
246 |
|
247 |
> |
248 |
> [snip] |
249 |
> |
250 |
|
251 |
Aside from the few nitpicks this looks good. Hope this helps. |
252 |
|
253 |
-- |
254 |
Daniel Campbell - Gentoo Developer, Trustee, Treasurer |
255 |
OpenPGP Key: 0x1EA055D6 @ hkp://keys.gnupg.net |
256 |
fpr: AE03 9064 AE00 053C 270C 1DE4 6F7A 9091 1EA0 55D6 |