>>>>> On Mon, 20 Nov 2017, Michał Górny wrote:

> New changes:

> 9d819c9 glep-0074: Disallow filenames containing whitespace
> 4124b2f glep-0074: Explicitly specify UTF-8 encoding
> 7f9bd9f glep-0074: Include suggestions from Daniel Campbell

Here are a few comments (quoting below only the parts of the text
referenced by them):

> The Manifest files use UTF-8 encoding.

I don't understand the purpose of that requirement. The only place
where bytes outside of the ASCII range can occur is in names of
distfiles, and these should simply be passed transparently. Otherwise,
you would have to reject any sequence of non-ASCII bytes that doesn't
form a valid UTF-8 sequence, which looks like an arbitrary restriction
to me.
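
To make the consequence concrete, here is a minimal Python sketch. The
helper name `is_valid_utf8` and the example file name are hypothetical
and not taken from the GLEP. A distfile name stored in Latin-1 is a
perfectly usable byte string, but a strict UTF-8 requirement would
force a tool to reject it:

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes as strict UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical distfile name, encoded in two different ways.
latin1_name = "café-1.0.tar.gz".encode("latin-1")  # contains the byte e9
utf8_name = "café-1.0.tar.gz".encode("utf-8")      # contains the bytes c3 a9

print(is_valid_utf8(utf8_name))    # True
print(is_valid_utf8(latin1_name))  # False
```

Passing the bytes through transparently would accept both names.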

> It is an error for a single file to be matched by multiple entries
> of different semantics, file size or checksum values. It is an error
> to specify another entry for a file matching ``IGNORE``, or one of its
> subdirectories.

What about regular files in a directory (or subdirectory) matched by
IGNORE? Looks like this case is not covered (?).

> All paths specified in the Manifest file must consist of characters
> corresponding to valid UTF-8 code points excluding the NULL character
> (``U+0000``) and characters classified as whitespace in the current
> version of the Unicode standard [#UNICODE]_. It is an error to use
> Manifest files in directories containing files whose names contain
> the disallowed characters.

See above. I believe that NUL and ASCII whitespace (i.e. characters 09
0a 0b 0c 0d 20) should be excluded, but excluding byte sequences like
"e1 9a 80" (which is the UTF-8 encoding for U+1680 "OGHAM SPACE MARK")
doesn't make sense.
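
The difference between the two rules can be sketched in Python. The
helper names are hypothetical, and `str.isspace()` is only an
approximation of the Unicode whitespace classification that the GLEP
refers to:

```python
# NUL plus the six ASCII whitespace bytes: 00 09 0a 0b 0c 0d 20.
FORBIDDEN_BYTES = frozenset(b"\x00\t\n\x0b\x0c\r ")


def has_forbidden_byte(name: bytes) -> bool:
    """Byte-level rule: reject only NUL and ASCII whitespace."""
    return any(b in FORBIDDEN_BYTES for b in name)


def has_unicode_whitespace(name: str) -> bool:
    """Character-level rule: reject NUL and anything Python treats as whitespace."""
    return any(ch == "\x00" or ch.isspace() for ch in name)


ogham = bytes.fromhex("e19a80")  # UTF-8 encoding of U+1680
print(has_forbidden_byte(ogham))                      # False
print(has_unicode_whitespace(ogham.decode("utf-8")))  # True
print(has_forbidden_byte(b"foo bar"))                 # True
```

The byte-level check needs no decoding step and no Unicode tables,
which is why it also works for names that are not valid UTF-8.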

> During the verification process, the client should compare the timestamp
> against the update time obtained from a local clock or a trusted time
> source. If the comparison result indicates that the Manifest at the time
> of receiving was already significantly outdated, the client should
> either fail the verification or require manual confirmation from user.

s/from user./from the user./

> ``TIMESTAMP <iso8601>``
> Specifies a timestamp of when the Manifest file was last updated.
> The timestamp must be a valid second-precision ISO8601 extended format

s/ISO8601/ISO 8601/

> ``IGNORE <path>``
> Ignores a subdirectory or file from Manifest checks. If the specified
> path is present, it and its contents are omitted from the Manifest
> verification (always pass). *Path* must be a plain file or directory
> path without a trailing slash, and must not contain wildcards.

What does that mean? Wildcards are not special (so "foo*" will match
literally), or wildcard characters like "*" are not allowed at all?
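
The first reading differs observably from shell-style globbing, as this
Python sketch shows. The function names are hypothetical, `fnmatch`
merely stands in for whatever pattern matching an implementation might
use, and only the path itself is compared (not the contents of an
ignored directory):

```python
import fnmatch


def matches_literal(path: str, ignore_path: str) -> bool:
    """Wildcard characters are not special; compare the strings literally."""
    return path == ignore_path


def matches_glob(path: str, ignore_path: str) -> bool:
    """Shell-style globbing, which the GLEP text presumably means to exclude."""
    return fnmatch.fnmatchcase(path, ignore_path)


print(matches_literal("foo*", "foo*"))    # True
print(matches_literal("foobar", "foo*"))  # False
print(matches_glob("foobar", "foo*"))     # True
```

Under the second reading, an entry such as "IGNORE foo*" would be
rejected as an error before any matching takes place.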

> ``AUX <filename> <size> <checksums>...``
> Equivalent to the ``DATA`` type, except that the filename is relative
> to ``files/`` subdirectory.

s/to/to the/

> 3. Process all ``MANIFEST`` entries, recursively. Verify the Manifest
> files according to `file verification`_ section, and include their

s/according to/according to the/

> 6. Verify the entries in *covered* set for incompatible duplicates

s/in *covered* set/in the *covered* set/

> 7. Verify all the files in the union of the *present* and *covered*
> sets, according to `file verification`_ section.

s/to/to the/

> a. If a ``IGNORE`` entry in the ``Manifest`` file covers
> the *original* directory (or one of the parent directories), stop.

s/a ``IGNORE`` entry/an ``IGNORE`` entry/

> An example top-level Manifest file for the Gentoo repository would have
> the following content::

>     TIMESTAMP 2017-10-30T10:11:12Z
>     IGNORE distfiles
>     IGNORE local
>     IGNORE lost+found
>     IGNORE packages
>     MANIFEST app-accessibility/Manifest 14821 SHA256 1b5f.. SHA512 f7eb..
>     ...
>     MANIFEST eclass/Manifest.gz 50812 SHA256 8c55.. SHA512 2915..
>     ...

> An example modern Manifest (disregarding backwards compatibility)
> for a package directory would have the following content::

>     DATA SphinxTrain-0.9.1-r1.ebuild 932 SHA256 3d3b.. SHA512 be4d..
>     DATA SphinxTrain-1.0.8.ebuild 912 SHA256 f681.. SHA512 0749..
>     DATA metadata.xml 664 SHA256 97c6.. SHA512 1175..
>     DATA files/gcc.patch 816 SHA256 b56e.. SHA512 2468..
>     DATA files/gcc34.patch 333 SHA256 c107.. SHA512 9919..
>     DIST SphinxTrain-0.9.1-beta.tar.gz 469617 SHA256 c1a4.. SHA512 1b33..
>     DIST sphinxtrain-1.0.8.tar.gz 8925803 SHA256 548e.. SHA512 465d..

Update hashes to BLAKE2B SHA512?

> This specification aims to avoid arbitrary restrictions. For this
> reason, the filename characters are only restricted by excluding two

s/the filename characters/filename characters/

> technically problematic groups:

> 1. The NULL character (``U+0000``) is normally used to indicate the end
> of a null-terminated string. Its use could therefore break programs
> written using C. Furthermore, it is not allowed in any known
> filesystem.

> 2. The whitespace characters are used to separate Manifest fields. While

s/The whitespace characters/Whitespace characters/

> 2. being able to run update automatically generated files locally
> without causing unnecessary verification failures.

Strike the word "run"?

> Strictly speaking, this information is already provided by the various
> ``metadata/timestamp*`` files that are already present. However,

Twice "already" in this sentence.

> The OpenPGP cleartext signature covers the contents of the Manifest,
> and is therefore compressed along with them. The possibility of using
> detached signature has been considered but it was rejected as

s/detached signature/a detached signature/

> The existence of additional entries for uncompressed Manifest checksums
> was debated. However, plain entries for the uncompressed file would
> be confusing if only the compressed file existed, and conflicting
> if both uncompressed and compressed variants existed. Furthermore,
> it has been pointed out that ``DIST`` entries do not have uncompressed
> variant either.

s/uncompressed variant/an uncompressed variant/

> .. [#DIST] According to Robin H. Johnson, 8.4% of all DIST entries
> at the time of writing are duplicate, representing a 2 MiB
> out of 25 MiB of DIST entries altogether.

s/a 2 MiB/2 MiB/

> Copyright
> =========

There should be two blank lines before this section heading (as
required by GLEP 2).

Ulrich