Gentoo Archives: gentoo-dev

From: "Michał Górny" <mgorny@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] [RFC] GLEP 74: Full-tree verification using Manifest files
Date: Sat, 28 Oct 2017 11:51:00
Message-Id: 1509191446.1791.24.camel@gentoo.org
In Reply to: Re: [gentoo-dev] [RFC] GLEP 74: Full-tree verification using Manifest files by "Robin H. Johnson"
1 On Fri, 27 Oct 2017 at 21:05 +0000, Robin H.
2 Johnson wrote:
3 > On Thu, Oct 26, 2017 at 10:12:25PM +0200, Michał Górny wrote:
4 > > 2. Alike the original Manifest2, the files should be split into two
5 > > groups — files whose authenticity is critical, and those whose
6 > > mismatch may be accepted in non-strict mode. The same classification
7 > > should apply both to files listed in Manifests, and to stray files
8 > > present only in the repository.
9 >
10 > nit: s/Alike/Like/, or rewrite the sentence.
11
12 Done.
13
14 > > Manifest file locations and nesting
15 > > -----------------------------------
16 > > The ``Manifest`` file located in the root directory of the repository
17 > > is called top-level Manifest, and it is used to perform the full-tree
18 > > verification. In order to verify the authenticity, it must be signed
19 > > using OpenPGP, using the armored cleartext format.
20 >
21 > Are detached signatures also permitted (for all levels of Manifest)?
22
23 I'd say no. Keeping it always contained in a single file is simpler.
24
25 > > The Manifest files can also specify ``IGNORE`` entries to skip Manifest
26 > > verification of subdirectories and/or files. Files and directories
27 > > starting with a dot are always implicitly ignored. All files that
28 > > are not ignored must be covered by at least one of the Manifests.
29 >
30 > Do we need to keep that implicit ignore rule? Rather, convert it to being
31 > always explicit.
32 >
33 > There is only one such file in the rsync checkout presently:
34 > metadata/.checksum-test-marker (see bug #572168, it is used to detect
35 > mis-configured mirrors).
36 >
37 > A SVN or Git repo might also have dot-named directories.
38
39 I like the implicit idea better as it is more consistent with normal
40 tool behavior, like 'ls' not listing the files. Dotfiles can be created
41 by many random tools or even the filesystem (especially in some cases
42 of overlay filesystems).
43
44 That said, the case of 'lost+found' just occurred to me. I suppose this
45 one we will want to always IGNORE.
46
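A minimal sketch of the implicit-ignore rule discussed above. The helper name and the ALWAYS_IGNORED_NAMES set are hypothetical; treating 'lost+found' as ignored at any depth is an assumption drawn from the remark above, not something the quoted spec text states.

```python
from pathlib import PurePosixPath

# Assumption: names ignored regardless of IGNORE entries ('lost+found'
# is only suggested in this thread, not part of the quoted spec text).
ALWAYS_IGNORED_NAMES = {"lost+found"}


def is_implicitly_ignored(relpath: str) -> bool:
    """Return True if any path component starts with a dot
    or is listed in ALWAYS_IGNORED_NAMES."""
    for part in PurePosixPath(relpath).parts:
        if part.startswith(".") or part in ALWAYS_IGNORED_NAMES:
            return True
    return False
```

With this rule, metadata/.checksum-test-marker and anything below a .git directory are skipped without needing an explicit IGNORE entry.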
47 > > All the files covered by a Manifest tree must reside on the same
48 > > filesystem. It is an error to specify entries applying to files
49 > > on another filesystem. If subdirectories of the Manifest tree reside
50 > > on a different filesystem, they must be explicitly excluded
51 > > via ``IGNORE``.
52 >
53 > Distfiles aren't required to be in the same filesystem.
54
55 I've updated the sentence to clearly indicate it's about «local (non-
56 ``DIST``) files».
57
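One possible way to check the same-filesystem requirement for local files is to compare device IDs. This is only a sketch: the function name is hypothetical, and using st_dev as a proxy for "same filesystem" is an assumption (for example, some filesystems report distinct device IDs for subvolumes).

```python
import os


def on_same_filesystem(root: str, path: str) -> bool:
    """Compare the device IDs that lstat() reports for root and path.

    lstat() is used so that a symlink is judged by where the link
    itself lives, not by its target.
    """
    return os.lstat(root).st_dev == os.lstat(path).st_dev
```

A verifier could call this for every local entry and report an error, or require an IGNORE entry, when it returns False.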
58 >
59 > > New Manifest tags
60 > > -----------------
61 >
62 > ...
63 > > ``IGNORE <path>``
64 > > Ignores a subdirectory or file from Manifest checks. If the specified
65 > > path is present, it and its contents are omitted from the Manifest
66 > > verification (always pass).
67 >
68 > Should this be accepted even by strict-mode? Alternatively, should strict mode
69 > require that other content is kept outside of the tree?
70
71 Yes, it should. I'd really prefer if strict mode still worked out-of-
72 the-box for most of our users without requiring them to do major
73 reshuffling of their systems.
74
75 Plus, see 'lost+found' above.
76
77 > > Algorithm for full-tree verification
78 > > ------------------------------------
79 >
80 > ...
81 > > 2. Start at the top-level Manifest file. Verify its OpenPGP signature.
82 > > Optionally verify the ``TIMESTAMP`` entry if present. Remove
83 > > the top-level Manifest from the *present* set.
84 >
85 > This spec does not state how the timestamp should be verified.
86 > Borrow from the original GLEP?
87
88 Let's try:
89
90 | 2. Start at the top-level Manifest file. Verify its OpenPGP signature.
91 | Optionally verify the ``TIMESTAMP`` entry if present.
92 | If the timestamp is significantly out of date compared to the local
93 | clock or a trusted source, halt or require manual intervention
94 | from the user. Remove the top-level Manifest from the *present* set.
95
96 Maybe it would look better if I split it into sub-points.
97
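The staleness check in the proposed step 2 could be sketched as follows. The timestamp format and the one-week threshold are assumptions made for illustration; the text above fixes neither, and "significantly out of date" is left to the implementation.

```python
from datetime import datetime, timedelta, timezone

# Assumptions: an ISO 8601 UTC timestamp and a one-week tolerance.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"
MAX_AGE = timedelta(days=7)


def timestamp_is_stale(value: str, now: datetime,
                       max_age: timedelta = MAX_AGE) -> bool:
    """Return True if the Manifest timestamp is older than max_age
    relative to 'now' (local clock or a trusted source)."""
    stamp = datetime.strptime(value, TIMESTAMP_FORMAT)
    stamp = stamp.replace(tzinfo=timezone.utc)
    return now - stamp > max_age
```

Passing 'now' explicitly lets the caller substitute a value obtained from a trusted source instead of the local clock.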
98 >
99 > > 4. Process all ``IGNORE`` entries. Remove any paths matching them
100 > > from the *present* set.
101 > >
102 > > 5. Collect all files covered by ``DATA``, ``MISC``, ``OPTIONAL``,
103 > > ``EBUILD`` and ``AUX`` entries into the *covered* set.
104 >
105 > Clarification request: point out again in this section, that IGNORE entries are
106 > prohibited from also matching any other entry. It is mentioned further up, but
107 > a reminder is good.
108
109 I've added an extra step:
110
111 | 6. Verify the entries in *covered* set for incompatible duplicates
112 | and collisions with ignored files as explained in `Manifest file
113 | locations and nesting`_.
114
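Steps 4 to 6 can be illustrated with plain sets of repository-relative paths. This is a sketch under assumptions: the function names are hypothetical, paths are compared as strings, and only the collision half of step 6 is shown (the check for incompatible duplicates is omitted).

```python
def is_ignored(path: str, ignore_paths: set) -> bool:
    """True if path equals an IGNORE path or lies beneath one."""
    for ign in ignore_paths:
        ign = ign.rstrip("/")
        if path == ign or path.startswith(ign + "/"):
            return True
    return False


def check_sets(present: set, covered: set, ignore_paths: set):
    # Step 4: drop everything matched by an IGNORE entry from *present*.
    present = {p for p in present if not is_ignored(p, ignore_paths)}
    # Step 6 (collision part): a covered file must not also be ignored.
    collisions = {p for p in covered if is_ignored(p, ignore_paths)}
    # Non-ignored files on disk that no Manifest entry covers.
    uncovered = present - covered
    return collisions, uncovered
```

Both returned sets should be empty for a tree that passes; a non-empty collisions set corresponds to the error case the added step 6 describes.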
115 >
116 > > Checksum algorithms
117 > > -------------------
118 > > This section is informational only. Specifying the exact set
119 > > of supported algorithms is outside the scope of this specification.
120 >
121 > ...
122 > > The method of introducing new hashes is defined by GLEP 59 [#GLEP59]_.
123 > > It is recommended that any new hashes are named after the Python
124 > > ``hashlib`` module algorithm names, transformed into uppercase.
125 >
126 > Would we ever consider algorithm parameters? Yes, outside of this spec, but checking anyway.
127
128 I can't say for sure but so far I've gone for 'no'. That's why gemato
129 does not support e.g. SHAKE* algorithms. If we ever decide to do that,
130 I suppose we can do it inside hash name, e.g. FOO-<param1>-<param2>...
131
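The naming recommendation (hashlib names transformed into uppercase) means a verifier can map a Manifest hash name back to hashlib by lowercasing it. A minimal sketch, with a hypothetical function name; this is not how gemato is implemented.

```python
import hashlib


def manifest_hash(name: str, data: bytes) -> str:
    """Compute the hex digest for an uppercase Manifest hash name
    by lowercasing it into the hashlib algorithm name."""
    return hashlib.new(name.lower(), data).hexdigest()
```

This also shows why parameterized algorithms do not fit directly: hashlib's SHAKE digests require an explicit output length when calling hexdigest(), and a bare name has no place to carry it.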
132 >
133 > > Manifest compression
134 > > --------------------
135 >
136 > ...
137 > > The specification permits uncompressed Manifests to exist alongside
138 > > their compressed counterparts, and multiple compressed formats
139 > > to coexist. If that is the case, the files must have the same
140 > > uncompressed content and the specification is free to choose either
141 > > of the files using the same base name.
142 >
143 > GLEP61, for the transition period, required compressed & uncompressed Manifests
144 > in the same directory to have identical content. Include mention of that here.
145
146 Can do. But I'll do it in 'Backwards compatibility' section:
147
148 | - if the Manifest files inside the package directory are compressed,
149 | an uncompressed file of identical content must coexist.
150
151 > Saying that either can be used is a potential issue.
152
153 Why? It also says that they must be identical, so it's of no difference
154 to the implementation which one is used.
155
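The "identical uncompressed content" requirement is straightforward to check. A sketch assuming gzip compression and a hypothetical function name; other compressed formats would need their own decompressor.

```python
import gzip
from pathlib import Path


def same_uncompressed_content(plain: Path, compressed: Path) -> bool:
    """Return True if the gzip-compressed Manifest decompresses to
    exactly the bytes of the uncompressed Manifest."""
    return plain.read_bytes() == gzip.decompress(compressed.read_bytes())
```

If this holds for every coexisting pair, it makes no difference to a verifier which of the files it reads.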
156 > > Tree design
157 > > -----------
158 >
159 > ...
160 >
161 > Add a minor header here, to say this is the end of the 'Tree design' section?
162
163 It's not the end, it's a description of the alternative. Both belong
164 in one section. I could add additional section level but I'd rather
165 not do that for a single use.
166
167 > > In the independent model, each sub-Manifest file is independent
168 > > of the parent Manifests. As a result, each of them needs to be signed
169 > > and verified independently. However, the parent Manifests still need
170 > > to list sub-Manifests (albeit without verification data) in order
171 > > to detect removal or replacement of subdirectories. This has
172 > > the following implications:
173 >
174 > ...
175 >
176 > > File verification model
177 > > -----------------------
178 > >
179 > > The verification model aims to provide full coverage against different
180 > > forms of attack. In particular, three different kinds of manipulation
181 > > are considered:
182 > > ...
183 >
184 > Selective denial of syncing was also one of the attacks in the original GLEPs
185 > that was considered. See details re timestamp below.
186
187 But that's not covered by 'file verification model', is it? So I suppose
188 it's better to detail it below.
189
190 >
191 > > Timestamp field
192 > > ---------------
193 > >
194 > > The top-level Manifest optionally allows using a ``TIMESTAMP`` tag
195 > > to include a generation timestamp in the Manifest. A similar feature
196 > > was originally proposed in GLEP 58 [#GLEP58]_.
197 > >
198 > > The timestamp can be used to detect delay or replay attacks against
199 > > Gentoo mirrors.
200 > >
201 > > Strictly speaking, this is already provided by the various
202 > > ``metadata/timestamp.*`` files provided already by Gentoo which are also
203 > > covered by the Manifest. However, including the value in the Manifest
204 > > itself has a little cost and provides the ability to perform
205 > > the verification stand-alone.
206 >
207 > There's a critical part of the GLEP58 spec that got missed here:
208 > https://www.gentoo.org/glep/glep-0058.html#timestamps-additional-distribution-of-metamanifest
209 > The timestamp needs to be usable to verify if the mirror is up to date vs
210 > known masters.
211 >
212 > The attack being defended against is that local community mirror (or MITM)
213 > isn't deliberately handing them an unmodified but stale copy of the tree.
214 >
215 > I do approve of changing the format of the tag; but it still needs to be
216 > linkable to a more verifiable source of truth,
217
218 I've tried to expand it a bit without getting too specific. New content
219 for paragraphs 2+:
220
221 | A malicious third party may use the principles of exclusion and replay
222 | to deny an update to clients, while at the same time recording
223 | the identity of clients to attack. The timestamp field can be used
224 | to detect that.
225 |
226 | In order to provide a more complete protection, the Gentoo
227 | Infrastructure should provide an ability to obtain the timestamps
228 | of all Manifests from a recent timeframe over a secure channel
229 | for comparison.
230 |
231 | Strictly speaking, this is already provided by the various
232 | ``metadata/timestamp.*`` files distributed by Gentoo, which are also
233 | covered by the Manifest. However, including the value in the Manifest
234 | itself has little cost and provides the ability to perform
235 | the verification stand-alone.
236
237 > > Backwards Compatibility
238 > > =======================
239 > >
240 > > This GLEP provides optional means of preserving backwards compatibility.
241 > > To preserve the backwards compatibility, the following needs to be
242 > > ensured:
243 > >
244 > > - all files within the package directory must be covered by ``Manifest``
245 > > file inside that package directory,
246 >
247 > This implies that IGNORE entries are NOT permitted to cover any file in
248 > a package directory during the transition period.
249
250 Well, obviously you can't use new tags in those files and rely on them
251 working correctly.
252
253 > >
254 > > - all distfiles used by the package must be covered by ``Manifest``
255 > > file inside the package directory,
256 >
257 > This implies that non-package-dir DIST entries may be a duplicate of a
258 > package-level DIST during the transition.
259
260 Yes, that's permitted if they're compatible.
261
262 > > - all files inside the ``files/`` subdirectory of a package directory
263 > > need to use the deprecated ``AUX`` tag (rather than ``DATA``),
264 > >
265 > > - all ``.ebuild`` files inside the package directory need to use
266 > > the deprecated ``EBUILD`` tag (rather than ``DATA``),
267 >
268 > Could we please note here, for the transitional period, that the
269 > file equivalence rule is applicable?
270 > During the transitional, the package Manifests may contain two entries for a
271 > given file: (DATA, EBUILD) or (DATA, AUX). The MISC type remains the
272 > same.
273
274 Equivalence rule is applicable always. However, there's no point
275 in duplicating the entry for the same file as that's only going
276 to increase space use.
277
278 > > - the Manifest files inside the package directory can be signed
279 > > to provide authenticity verification.
280 >
281 > Why do we need to specify this in backwards compat, it's still permitted above.
282
283 But it makes no sense when the top-level Manifest is signed. This points out
284 that for tools not supporting full-tree verification, smaller signatures
285 need to be used (skipping the fact that Portage never implemented
286 it).
287
288 Updated the two linked files.
289
290 --
291 Best regards,
292 Michał Górny
