commit: d39f865f5bbad9523ad6c2cfd06af95d9fa7d402
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Thu Nov 23 18:44:54 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Thu Nov 23 18:44:54 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=d39f865f

glep-0074: Make extended filename encoding optional

 glep-0074.rst | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/glep-0074.rst b/glep-0074.rst
index 6db6caa..5270b7a 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -142,8 +142,15 @@ corresponding to valid UTF-8 code points excluding the backwards slash
 (``\``) and characters classified as control characters and whitespace
 in the current version of the Unicode standard [#UNICODE]_.

-Any of the excluded characters that are present in path must be encoded
-using one of the following escape sequences:
+The implementation can optionally support extended filename encoding
+to support those paths. If the encoding is not supported,
+the implementation must reject directories containing any files using
+non-compliant names, as well as Manifest files whose filename field
+contains such filenames.
+
+If the encoding is supported, then all of the excluded characters that
+are present in path must be encoded using one of the following escape
+sequences:

 - characters in the ``U+0000`` to ``U+007F`` range can be encoded
   as ``\xHH`` where ``HH`` specifies the zero-padded, hexadecimal
@@ -615,6 +622,13 @@ by attempting to locate the size field and take everything before it
 as filename. This was terribly fragile and even if it worked, it would
 solve the problem only partially.

+To preserve compatibility with the current implementations and given
+that all of the listed characters are not allowed for the foreseeable
+Gentoo uses, the extended encoding support is optional. If such support
+is not provided, the implementation must unconditionally reject any
+such files. Ignoring them implicitly would be confusing, and it is
+not possible to use them in explicit ``IGNORE`` entries.
+
 The character encoding method provides means to overcome the character
 restrictions to extend the tool usability beyond immediate Gentoo uses.
 The backslash escape form based on Python unicode strings is used