Gentoo Archives: gentoo-commits

From: "Michał Górny" <mgorny@g.o>
To: gentoo-commits@l.g.o
Subject: [gentoo-commits] data/glep:glep-manifest commit in: /
Date: Mon, 20 Nov 2017 18:41:06
Message-Id: 1511203241.9d819c9a981416936dcda2f55e54ea70e494e59e.mgorny@gentoo
1 commit: 9d819c9a981416936dcda2f55e54ea70e494e59e
2 Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
3 AuthorDate: Mon Nov 20 18:40:41 2017 +0000
4 Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
5 CommitDate: Mon Nov 20 18:40:41 2017 +0000
6 URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=9d819c9a
7
8 glep-0074: Disallow filenames containing whitespace
9
10 glep-0074.rst | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
11 1 file changed, 52 insertions(+)
12
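For readers who want to see the rule from the diff below in executable form, here is a minimal Python sketch of the path check it describes. It is not taken from Portage or any other Manifest tool. The function names `find_disallowed` and `is_valid_manifest_path` are hypothetical, and `str.isspace()` is only an approximation of "characters classified as whitespace in the current version of the Unicode standard": Python's definition is close to the Unicode White_Space property but not identical.

```python
from typing import List, Tuple


def find_disallowed(path: str) -> List[Tuple[int, str]]:
    """Return (index, 'U+XXXX') for each disallowed character in path."""
    bad = []
    for index, ch in enumerate(path):
        # Disallowed: NULL (U+0000) and whitespace.
        # Assumption: str.isspace() stands in for the Unicode
        # whitespace classification referenced by the GLEP text.
        if ch == "\x00" or ch.isspace():
            bad.append((index, "U+{:04X}".format(ord(ch))))
    return bad


def is_valid_manifest_path(raw: bytes) -> bool:
    """Check that raw is valid UTF-8 with no disallowed characters."""
    try:
        text = raw.decode("utf-8")  # strict decoding rejects invalid UTF-8
    except UnicodeDecodeError:
        return False
    return not find_disallowed(text)
```

A tool could call `is_valid_manifest_path(os.fsencode(name))` for each filename found while walking a directory and raise an error on the first failure, since the text treats the mere presence of such files as an error.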
13 diff --git a/glep-0074.rst b/glep-0074.rst
14 index f96a58e..46ad9fe 100644
15 --- a/glep-0074.rst
16 +++ b/glep-0074.rst
17 @@ -132,6 +132,13 @@ are not otherwise ignored reside on a different filesystem, or symbolic
18 links point to targets on a different filesystem, they must
19 be explicitly excluded via ``IGNORE``.
20
21 +All paths specified in the Manifest file must be valid UTF-8 and must
22 +not contain the NULL character (``U+0000``) or any character
23 +classified as whitespace in the current version of the Unicode
24 +standard [#UNICODE]_. It is an error to use Manifest files in
25 +directories containing files whose names contain the disallowed
26 +characters.
27 +
28
29 File verification
30 -----------------
31 @@ -542,6 +549,45 @@ In particular, tools might then claim that a file does not exist when
32 it clearly does because it was skipped due to filesystem boundaries.
33
34
35 +Filename character set restriction
36 +----------------------------------
37 +
38 +The valid set of filename characters for the Gentoo repository
39 +is restricted by the devmanual 'File Naming Rules' section
40 +[#FILE-NAMING-RULES]_, and enforced via a git hook. The valid distfile
41 +names are not restricted explicitly -- however, the PMS dependency
42 +specification syntax [#PMS-FETCH]_ implicitly makes it impossible to use
43 +filenames containing whitespace.
44 +
45 +This specification aims to avoid arbitrary restrictions. For this
46 +reason, the filename characters are only restricted by excluding two
47 +technically problematic groups:
48 +
49 +1. The NULL character (``U+0000``) is normally used to indicate the end
50 + of a null-terminated string. Its use could therefore break programs
51 + written in C. Furthermore, it is not allowed in filenames on any
52 + known filesystem.
53 +
54 +2. The whitespace characters are used to separate Manifest fields. While
55 + technically it would be enough to restrict the space character
56 + (``U+0020``) that is normally used as the separator, all whitespace
57 + characters are forbidden to avoid confusion and implementation
58 + errors.
59 +
60 +While the specification could be extended to allow such filenames
61 +by using some form of escaping, there is currently no apparent need
62 +for such a feature.
63 +
64 +Historically, Portage tried to overcome the whitespace limitation
65 +by locating the size field and taking everything before it as
66 +the filename. This was fragile and, even when it worked, solved
67 +the problem only partially.
68 +
69 +Since the same restrictions apply to ``IGNORE`` rules, it is currently
70 +not possible to either list or ignore a file whose name contains
71 +whitespace. Therefore, the presence of such files is forbidden entirely.
72 +
73 +
74 File verification model
75 -----------------------
76
77 @@ -880,10 +926,16 @@ References
78 .. [#GLEP61] GLEP 61: Manifest2 compression
79 (https://www.gentoo.org/glep/glep-0061.html)
80
81 +.. [#UNICODE] The Unicode standard
82 + (https://unicode.org/versions/latest/)
83 +
84 .. [#PMS-FETCH] Package Manager Specification: Dependency Specification
85 Format - SRC_URI
86 (https://projects.gentoo.org/pms/6/pms.html#x1-940008.2.10)
87
88 +.. [#FILE-NAMING-RULES] Ebuild File Format -- Gentoo Development Guide
89 + (https://devmanual.gentoo.org/ebuild-writing/file-format/#file-naming-rules)
90 +
91 .. [#MD5] RFC1321: The MD5 Message-Digest Algorithm
92 (https://www.ietf.org/rfc/rfc1321.txt)