Gentoo Archives: gentoo-portage-dev

From: Zac Medico <zmedico@g.o>
To: gentoo-portage-dev@l.g.o
Subject: Re: [gentoo-portage-dev] [PATCH] xpak-helper: rewrite to rely more on argparse
Date: Mon, 02 Nov 2015 17:10:33
Message-Id: 563798FF.3040707@gentoo.org
In Reply to: Re: [gentoo-portage-dev] [PATCH] xpak-helper: rewrite to rely more on argparse by Zac Medico
On 11/02/2015 09:06 AM, Zac Medico wrote:
> On 11/02/2015 08:52 AM, Mike Frysinger wrote:
>> On 01 Nov 2015 09:36, Zac Medico wrote:
>>> In order to handle python3 with arguments containing UTF-8 characters
>>> (in ${PKGDIR}) and a mis-matched sys.getfilesystemencoding() value, it's
>>> safest to decode the arguments like chmod-lite.py does.
>>
>> it seems wrong that we have incomplete coverage here.
>> some tools do it and some do not.
>
> Yeah, complete coverage would be nice. Most if not all of the python
> helpers that are called from the ebuild environment already use this
> method to decode filename arguments (dohtml.py was only fixed recently,
> for bug 561846).
>
>>> We should create
>>> a function for this code which is also duplicated in install.py:
>>
>> you mean portage._decode_argv ?
>
> Yes, I forgot about that function.
>
>> what if we create a new module like "commandline" that provides an
>> ArgumentParser interface that takes care of this for us ?
>> -mike
>
> That sounds good. After it decodes the arguments, it should decode them
> as UTF-8 with errors='strict', and exit the program immediately if it
> triggers a UnicodeDecodeError.
>

I mean, after it _encodes_ them with surrogateescape, it should
immediately decode them as UTF-8 with errors='strict'. That way, we'll
have unicode strings to pass to argparse (we don't want to pass raw
bytes to argparse).
--
Thanks,
Zac
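
A rough sketch of the encode-then-strict-decode step described above, for
illustration only. It is not the code from portage._decode_argv or
chmod-lite.py; the names decode_argv and main, the fs_encoding parameter,
and the "paths" argument are all hypothetical. It assumes Python 3, where
sys.argv has already been decoded with the filesystem encoding and the
surrogateescape error handler.

```python
import argparse
import sys


def decode_argv(argv, fs_encoding=None):
    """Re-decode argv as strict UTF-8 (hypothetical helper name)."""
    if fs_encoding is None:
        fs_encoding = sys.getfilesystemencoding()
    decoded = []
    for arg in argv:
        # Undo Python 3's own decoding to recover the original bytes.
        raw = arg.encode(fs_encoding, errors="surrogateescape")
        # Strict decode: raises UnicodeDecodeError on invalid UTF-8.
        decoded.append(raw.decode("utf-8", errors="strict"))
    return decoded


def main(argv=None, fs_encoding=None):
    """Return the parsed paths, or exit status 1 on undecodable input."""
    if argv is None:
        argv = sys.argv[1:]
    try:
        args = decode_argv(argv, fs_encoding)
    except UnicodeDecodeError as e:
        sys.stderr.write("invalid UTF-8 in argument: %s\n" % (e,))
        return 1
    # argparse only ever sees unicode strings, never raw bytes.
    parser = argparse.ArgumentParser()
    parser.add_argument("paths", nargs="*")  # hypothetical argument
    return parser.parse_args(args).paths
```

A script would call something like sys.exit(main()) after adapting the
return value to an exit status. Passing fs_encoding explicitly is only
there so the mis-matched case can be exercised without changing the locale.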