On 11/02/2015 09:06 AM, Zac Medico wrote:
> On 11/02/2015 08:52 AM, Mike Frysinger wrote:
>> On 01 Nov 2015 09:36, Zac Medico wrote:
>>> In order to handle python3 with arguments containing UTF-8 characters
>>> (in ${PKGDIR}) and a mis-matched sys.getfilesystemencoding() value, it's
>>> safest to decode the arguments like chmod-lite.py does.
>>
>> it seems wrong that we have incomplete coverage here.
>> some tools do it and some do not.
>
> Yeah, complete coverage would be nice. Most if not all of the python
> helpers that are called from the ebuild environment already use this
> method to decode filename arguments (dohtml.py was only fixed recently,
> for bug 561846).
>
>>> We should create
>>> a function for this code which is also duplicated in install.py:
>>
>> you mean portage._decode_argv ?
>
> Yes, I forgot about that function.
>
>> what if we create a new module like "commandline" that provides an
>> ArgumentParser interface that takes care of this for us ?
>> -mike
>
> That sounds good. After it decodes the arguments, it should decode them
> as UTF-8 with errors='strict', and exit the program immediately if it
> triggers a UnicodeDecodeError.
>
|
I mean, after it _encodes_ them with surrogateescape, it should
immediately decode them as UTF-8 with errors='strict'. That way, we'll
have unicode strings to pass to argparse (we don't want to pass raw
bytes to argparse).
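
The round trip described above could look roughly like the sketch below.
It is only an illustration, not portage's actual code: the names
decode_argv_strict and parse_args are hypothetical, and the fs_encoding
parameter is an assumption added so that a mis-matched filesystem
encoding can be simulated.

```python
import argparse
import sys


def decode_argv_strict(argv, fs_encoding=None):
    """Hypothetical helper: recover the raw argv bytes, then decode as UTF-8.

    Python 3 decodes argv using the filesystem encoding with the
    surrogateescape error handler, so encoding the same way gives back
    the original bytes even when that encoding was the wrong guess.
    """
    if fs_encoding is None:
        fs_encoding = sys.getfilesystemencoding()
    decoded = []
    for arg in argv:
        raw = arg.encode(fs_encoding, "surrogateescape")
        # Raises UnicodeDecodeError if the bytes are not valid UTF-8.
        decoded.append(raw.decode("utf-8", "strict"))
    return decoded


def parse_args(parser, argv):
    """Hypothetical wrapper: hand only unicode strings to argparse."""
    try:
        args = decode_argv_strict(argv)
    except UnicodeDecodeError as e:
        # ArgumentParser.error() prints the message and exits with status 2.
        parser.error("argument is not valid UTF-8: %s" % e)
    return parser.parse_args(args)
```

A caller would build an argparse.ArgumentParser as usual and call
parse_args(parser, sys.argv[1:]), so argparse never sees raw bytes or
surrogate escapes.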
--
Thanks,
Zac