Gentoo Archives: gentoo-dev

From: "Robin H. Johnson" <robbat2@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] [v1.0.2] GLEP 74: Full-tree verification using Manifest files
Date: Mon, 30 Oct 2017 19:56:39
Message-Id: robbat2-20171030T184928-290641978Z@orbis-terrarum.net
In Reply to: Re: [gentoo-dev] [v1.0.2] GLEP 74: Full-tree verification using Manifest files by "Michał Górny"
1 On Mon, Oct 30, 2017 at 05:51:36PM +0100, Michał Górny wrote:
2 ...
3 > Directory tree coverage
4 > -----------------------
5 This section should maybe cover OPTIONAL in more detail (see more
6 below).
7
8 > The Manifest files can also specify ``IGNORE`` entries to skip Manifest
9 > verification of subdirectories and/or files. The package manager can
10 > support injecting ignore paths to account for additional files created,
11 > modified or removed by user's processes that would not be ignored
12 > by existing rules. Files and directories starting with a dot are always
13 > implicitly ignored. All files that are not ignored must be covered
14 > by at least one of the Manifests.
15 (English) There are multiple points in this paragraph, and I missed some on
16 first reading. The package manager part is esp. lost.
17 || All files that are not ignored must be covered by at least one of the
18 || Manifests. Files may be ignored in several ways:
19 || - Files and directories starting with a dot are always implicitly
20 || ignored.
21 || - The Manifest files can specify ``IGNORE`` entries to skip
22 || verification of subdirectories and/or files.
23 || - The package manager can support injecting ignore paths to account for
24 || additional files created, modified or removed by user's processes that
25 || would not be ignored by existing rules.
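
To make the three sources of "ignored" concrete, here is a minimal sketch of how a checker might combine them. The function name `is_ignored` and its parameters are made up for illustration, and treating every path component that starts with a dot as ignored is my reading of the dotfile rule, not something the GLEP text spells out.

```python
from pathlib import PurePosixPath


def is_ignored(path, manifest_ignores, injected_ignores=()):
    """Return True if a repository-relative path is exempt from verification.

    Hypothetical helper: manifest_ignores are paths from IGNORE entries,
    injected_ignores are extra paths supplied by the package manager.
    """
    parts = PurePosixPath(path).parts

    # Rule 1: files and directories starting with a dot are implicitly ignored.
    if any(part.startswith(".") for part in parts):
        return True

    # Rules 2 and 3: an IGNORE entry or an injected path covers the path
    # itself and everything below it (compared component by component).
    for prefix in (*manifest_ignores, *injected_ignores):
        prefix_parts = PurePosixPath(prefix).parts
        if parts[:len(prefix_parts)] == prefix_parts:
            return True

    return False
```

Comparing whole path components rather than raw string prefixes keeps an entry like `metadata/cache` from accidentally matching `metadata/cachefoo`.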
26
27 > File verification
28 > -----------------
29 > When verifying a file against the Manifest, the following rules are
30 > used:
31 ...
32 > 3. If the file is covered by an entry of the ``OPTIONAL`` type:
33 > a. if the file is present, then the verification fails,
34 > b. otherwise, the verification succeeds.
35 Move the OPTIONAL type further up in the verification list maybe? See
36 the interpretation question below.
37
38
39 > Modern Manifest tags
40 > --------------------
41 ...
42 > ``IGNORE <path>``
43 > Ignores a subdirectory or file from Manifest checks. If the specified
44 > path is present, it and its contents are omitted from the Manifest
45 > verification (always pass).
46 Clarification Needed:
47 Should subdirectories have a trailing slash in the Manifest or not?
48 This affects matching of the type.
49 Case 1.1:
50 Manifest has 'IGNORE foo'; 'foo' is a file; => ignored.
51 Case 1.2:
52 Manifest has 'IGNORE foo'; 'foo' is a directory; => ignored.
53 Case 2.1:
54 Manifest has 'IGNORE foo/'; 'foo' is a file; => FAIL
55 Case 2.2:
56 Manifest has 'IGNORE foo/'; 'foo' is a directory; => ignored.
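
A minimal sketch of the four cases above, assuming the trailing slash is taken to mean "must be a directory". That semantics is exactly what I am asking about, so treat it as one possible answer; `check_ignore_entry` and its return values are hypothetical.

```python
import os


def check_ignore_entry(entry, root):
    """Evaluate one IGNORE entry against the tree under root.

    Returns "ignored", "fail", or "absent". Hypothetical helper that
    assumes a trailing slash restricts the entry to directories.
    """
    wants_directory = entry.endswith("/")
    path = os.path.join(root, entry.rstrip("/"))

    if not os.path.lexists(path):
        # Nothing on disk, so there is nothing to ignore.
        return "absent"
    if wants_directory and not os.path.isdir(path):
        # Case 2.1: 'IGNORE foo/' but 'foo' is a regular file.
        return "fail"
    # Cases 1.1, 1.2 and 2.2.
    return "ignored"
```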
57
58
59 > ``OPTIONAL <path>``
60 > Specifies a file that does not exist in the distribution but if it
61 > did, it would be marked as ``MISC``. In the strict mode, the file
62 > must not exist for the verification to pass. The package manager
63 > may ignore a stray file matching this entry if operating in non-strict
64 > mode.
65 This has gotten less clear.
66 Is the following the correct interpretation?
67 if (package manager is strict) then
68     Treat the OPTIONAL entry as NOT present in the Manifest.
69     This will cause files to be in the present set but not the covered set.
70 else
71     Treat the OPTIONAL entry as 'IGNORE <path>'.
72 endif
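
The same interpretation as a small runnable sketch, in case it helps to pin down the wording. It only models my reading of the draft; `uncovered_files` is a hypothetical name, entries are simplified to (tag, path) pairs, and paths are matched exactly rather than by directory prefix.

```python
def uncovered_files(present, entries, strict):
    """Return the present files that no Manifest entry accounts for.

    present: iterable of repository-relative paths found on disk.
    entries: iterable of (tag, path) pairs from the Manifest.
    strict:  whether the package manager runs in strict mode.
    """
    covered = set()
    ignored = set()
    for tag, path in entries:
        if tag == "OPTIONAL":
            if strict:
                # Strict: behave as if the entry were not in the Manifest.
                continue
            # Non-strict: behave as if the entry were 'IGNORE <path>'.
            ignored.add(path)
        elif tag == "IGNORE":
            ignored.add(path)
        else:
            covered.add(path)
    return sorted(p for p in present if p not in covered and p not in ignored)
```

Under this reading, a stray file matching an OPTIONAL entry ends up uncovered (a failure) in strict mode and is skipped otherwise.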
73
74 > ``DIST <filename> <size> <checksums>…``
75 > Specifies a distfile entry used to verify files fetched as part
76 > of ``SRC_URI``. The filename must match the filename used to store
77 > the fetched file as specified in the PMS [#PMS-FETCH]_. The package
78 > manager must reject the fetched file if it fails verification.
79 > ``DIST`` entries apply to all packages below the Manifest file
80 > specifying them.
81 Nit: You have used a unicode ellipsis '…' in some places and plain ASCII
82 ellipsis '...' in others. Stick to ASCII?
83
84
85 > An example Manifest file (informational)
86 > ----------------------------------------
87 Can you add an example for OPTIONAL?
88
89 > Tree layout restrictions
90 > ------------------------
91 > The Gentoo repository does not use symbolic links. Some Gentoo
92 > repositories do, however. To provide a simple solution for dealing with
93 > symlinks without having to take care to implement special handling for
94 > them, the common behavior of implicitly resolving them is used.
95 > Therefore, symbolic links to files are stored as if they were regular
96 > files, and symbolic links to directories are followed as if they were
97 > regular directories.
98 Clarification: should cross-device symlinks be rejected? (perhaps
99 implicit, but wanted to check)
100 If so, need to add to 'Algorithm for full-tree verification' section.
101
102 > Dotfiles are implicitly ignored as that is a common notion used
103 > in software written for POSIX systems. All other filenames require
104 > explicit ``IGNORE`` lines.
105 This paragraph should re-iterate that the package manager may specify
106 additional files to be ignored per the user.
107
108 > The algorithm is restricted to work on a single filesystem. This is
109 > mostly relevant when scanning for top-level Manifest — we do not want
110 > to cross filesystem boundaries then. However, to ensure consistent
111 > bidirectional behavior we need to also ban them when operating downwards
112 > the tree.
113 >
114 > The directories and files on different filesystems needs to be ignored
115 > explicitly as implicitly skipping them would cause confusion.
116 > In particular, tools might then claim that a file does not exist when
117 > it clearly does because it was skipped due to filesystem boundaries.
118 The downward path needs to check the device on files.
119 Otherwise, a layout like this would be missed:
120 cat/pn/Manifest
121 cat/pn/files/ <-- different filesystem here
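
A rough sketch of such a downward check, comparing `st_dev` of every entry against the directory holding the top-level Manifest. `find_cross_device` is a hypothetical name, and using `lstat` (so symlinks are judged by where the link itself lives) is a simplification on my part, not a statement about what the GLEP should require.

```python
import os


def find_cross_device(root):
    """List paths under root that live on a different device than root."""
    root_dev = os.lstat(root).st_dev
    offenders = []
    for dirpath, dirnames, filenames in os.walk(root):
        same_device_dirs = []
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if os.lstat(path).st_dev == root_dev:
                same_device_dirs.append(name)
            else:
                offenders.append(path)
        # Do not descend into directories on another filesystem.
        dirnames[:] = same_device_dirs

        # Files need the check too, e.g. a bind-mounted single file.
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.lstat(path).st_dev != root_dev:
                offenders.append(path)
    return offenders
```

Anything returned here would have to be reported or covered by an explicit IGNORE, rather than being skipped silently.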
122
123 > Non-obligatory Manifest verification
124 > ------------------------------------
125 ...
126 > The traditional ``MISC`` type is amended with a complementary
127 > ``OPTIONAL`` tag to account for files that are not provided
128 > in the specific repository. It aims to ensure that the same path would
129 > be non-fatal when provided by the repository but fatal when created
130 > by the user tooling.
131 Clarify the last sentence to be for strict mode only?
132
133 > Splitting distfile checksums from file checksums
134 > ------------------------------------------------
135 >
136 > Another problem with the current Manifest format is that the checksums
137 > for fetched files are combined with checksums for local files
138 > in a single file inside the package directory. It has been specifically
139 > pointed out that:
140 >
141 > - since distfiles are sometimes reused across different packages,
142 > the repeating checksums are redundant,
143 >
144 > - mirror admins were interested in the possibility of verifying all
145 > the distfiles with a single tool.
146 >
147 > This specification does not provide a clean solution to this problem.
148 > It technically permits moving ``DIST`` entries to higher-level Manifests
149 > but the usefulness of such a solution is doubtful.
150 Clarification of validity:
151 If cat/pn1 and cat/pn2 share 1000 DIST files, would it be valid to
152 have the following:
153 cat/pn1/Manifest:MANIFEST ../Manifest.some-shared-name 1234 ...
154 cat/pn1/Manifest:DIST unique-pn1-dist.tgz 1234 ...
155 cat/pn2/Manifest:MANIFEST ../Manifest.some-shared-name 1234 ...
156 cat/pn2/Manifest:DIST unique-pn2-dist.tgz 1234 ...
157
158 > Performance considerations
159 > --------------------------
160 ...
161 > To improve speed on I/O and/or CPU-restrained systems even further,
162 > the algorithms can be easily extended to perform incremental
163 > verification. Given that rsync does not preserve mtimes by default,
164 > the tool can take advantage of mtime and Manifest comparisons to recheck
165 > only the parts of the repository that have changed.
166 Implementation note, not for GLEP addition:
167 If the package manager caches by filename,inode,mtime locally, it can
168 then avoid repeat-checking of the hashes (it only needs a stat),
169 provided that it is happy there was no local attacker who might perform
170 an in-place modification of a file (mtime&inode remain the same).
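
Roughly what I have in mind, as a sketch only. The cache is a plain dict here, SHA-256 stands in for whatever hashes the Manifest carries, and `verify_cached` is a made-up name. I also fold the expected digest into the cached value so that a changed Manifest entry forces a recheck.

```python
import hashlib
import os


def verify_cached(path, expected_sha256, cache):
    """Verify a file, skipping the hash if (inode, mtime) are unchanged.

    cache maps filename -> (inode, mtime_ns, digest) of the last
    successful verification.
    """
    st = os.stat(path)
    key = (st.st_ino, st.st_mtime_ns, expected_sha256)
    if cache.get(path) == key:
        # Only a stat was needed; trust the earlier verification.
        return True

    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)

    if digest.hexdigest() != expected_sha256:
        cache.pop(path, None)
        return False

    cache[path] = key
    return True
```

As noted above, this is only as trustworthy as the assumption that nobody rewrote the file in place while keeping mtime and inode intact.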
171
172 >
173 > Furthermore, the package manager implementations can restrict checking
174 > only to the parts of the repository that are actually being used.
175 >
176 >
177 > Backwards Compatibility
178 > =======================
179 >
180 > This GLEP provides optional means of preserving backwards compatibility.
181 > To preserve the backwards compatibility, the following needs to be
182 > ensured:
183 "package directory" is common to all of the items here, if you move that
184 to the list preamble, it's a lot cleaner to read.
185 || To preserve the backwards compatibility, the following needs to be
186 || ensured about package directories:
187 And clean up the list:
188 s/in(side)? (that|the) package directory//
189 s/of a package directory//
190
191 --
192 Robin Hugh Johnson
193 Gentoo Linux: Dev, Infra Lead, Foundation Asst. Treasurer
194 E-Mail : robbat2@g.o
195 GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
196 GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
