Gentoo Archives: gentoo-dev

From: Ulrich Mueller <ulm@g.o>
To: Ulrich Mueller <ulm@g.o>
Cc: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v2]
Date: Tue, 21 Nov 2017 06:30:27
Message-Id: 23059.51193.378762.346025@a1i15.kph.uni-mainz.de
In Reply to: Re: [gentoo-dev] [RFC] GLEP 74 post-Council review update [v2] by Ulrich Mueller

>>>>> On Mon, 20 Nov 2017, Ulrich Mueller wrote:

>>>>> On Mon, 20 Nov 2017, Michał Górny wrote:
>> All paths specified in the Manifest file must consist of characters
>> corresponding to valid UTF-8 code points excluding the NULL character
>> (``U+0000``) and characters classified as whitespace in the current
>> version of the Unicode standard [#UNICODE]_. It is an error to use
>> Manifest files in directories containing files whose names contain
>> the disallowed characters.

> See above. I believe that NUL and ASCII whitespace (i.e. characters
> 09 0a 0b 0c 0d 20) should be excluded, but excluding byte sequences
> like "e1 9a 80" (which is the UTF-8 encoding for U+1680 "OGHAM SPACE
> MARK") doesn't make sense.

Thinking about it, this still looks too complicated. So, exclude only
SPACE (0x20) which is used as separator between fields. (NUL can be
excluded too, but it won't occur anyway.)
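
To make the difference between the three rules concrete, here is a rough
Python sketch. The function names are made up for illustration and are
not part of GLEP 74, and str.isspace() is only an approximation of the
Unicode whitespace classification.

```python
# Illustrative sketch only; function names are hypothetical, not from GLEP 74.

ASCII_WHITESPACE = "\t\n\x0b\x0c\r "  # bytes 09 0a 0b 0c 0d 20


def ok_unicode_ws(path: str) -> bool:
    """Draft wording: no NUL and no Unicode whitespace.

    str.isspace() is used as an approximation of the Unicode definition.
    """
    return "\0" not in path and not any(c.isspace() for c in path)


def ok_ascii_ws(path: str) -> bool:
    """Earlier suggestion: no NUL and no ASCII whitespace."""
    return "\0" not in path and not any(c in ASCII_WHITESPACE for c in path)


def ok_space_only(path: str) -> bool:
    """Simplest rule: only SPACE (0x20), the field separator, is excluded."""
    return " " not in path


ogham = "foo\u1680bar"  # contains U+1680 OGHAM SPACE MARK

print("\u1680".encode("utf-8").hex(" "))  # the UTF-8 bytes of U+1680
print(ok_unicode_ws(ogham), ok_ascii_ws(ogham), ok_space_only(ogham))
```

Only the first rule rejects the name containing U+1680. The other two
accept it, since that character cannot be confused with the field
separator.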

In fact, all Manifest files in the tree are ASCII only.
So alternatively, filenames could be restricted to printable ASCII.
This is also what GLEP 31 [1] says:

| Suitable Characters for File and Directory Names
|
| Characters outside the ASCII 0..127 range cannot safely be used for
| file or directory names. (Of course, not all characters inside the
| ASCII 0..127 range can be used safely either.)
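
A printable-ASCII restriction would be trivial to check. A minimal
sketch in Python, assuming "printable" means 0x21..0x7e, i.e. with SPACE
excluded because it is the field separator. The function name is
hypothetical, and GLEP 31 may rule out further characters that this
check does not cover.

```python
def is_printable_ascii_path(path: str) -> bool:
    """Accept only characters in 0x21..0x7e (printable ASCII without SPACE).

    Assumption: SPACE is excluded because it separates Manifest fields.
    """
    return all(0x21 <= ord(c) <= 0x7E for c in path)


for name in ("files/foo-1.0.patch", "foo bar", "G\u00f3rny.patch"):
    print(repr(name), is_printable_ascii_path(name))
```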

Ulrich


[1] Character Sets for Portage Tree Items
https://www.gentoo.org/glep/glep-0031.html