Gentoo Archives: gentoo-portage-dev

From: Zac Medico <zmedico@g.o>
To: gentoo-portage-dev@l.g.o, "Michał Górny" <mgorny@g.o>
Subject: Re: [gentoo-portage-dev] [PATCH 1/2] fetch: Use real os.walk() to avoid unicode issues with Portage
Date: Mon, 21 Oct 2019 10:07:29
Message-Id: f031900f-fda7-147e-5bea-e78f6eada966@gentoo.org
In Reply to: Re: [gentoo-portage-dev] [PATCH 1/2] fetch: Use real os.walk() to avoid unicode issues with Portage by "Michał Górny"
On 10/21/19 2:16 AM, Michał Górny wrote:
> On Mon, 2019-10-21 at 02:10 -0700, Zac Medico wrote:
>> On 10/21/19 1:43 AM, Michał Górny wrote:
>>> Use real os.walk() when getting filenames for FlatLayout. Unlike
>>> the wrapped Portage module, it returns str output for a str path
>>> parameter, so we don't have to recode it back and forth.
>>>
>>> Signed-off-by: Michał Górny <mgorny@g.o>
>>> ---
>>>  lib/portage/package/ebuild/fetch.py | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/portage/package/ebuild/fetch.py b/lib/portage/package/ebuild/fetch.py
>>> index cedf12b19..be277f1a3 100644
>>> --- a/lib/portage/package/ebuild/fetch.py
>>> +++ b/lib/portage/package/ebuild/fetch.py
>>> @@ -11,6 +11,7 @@ import io
>>>  import itertools
>>>  import json
>>>  import logging
>>> +import os as real_os
>>>  import random
>>>  import re
>>>  import stat
>>> @@ -270,7 +271,7 @@ class FlatLayout(object):
>>>  		return filename
>>>
>>>  	def get_filenames(self, distdir):
>>> -		for dirpath, dirnames, filenames in os.walk(distdir,
>>> +		for dirpath, dirnames, filenames in real_os.walk(distdir,
>>>  				onerror=_raise_exc):
>>>  			return iter(filenames)
>>>
>>>
>>
>> The real_os.walk will trigger UnicodeEncodeError if distdir can't be
>> encoded with sys.getfilesystemencoding(). It's an edge case, but
>> generally I prefer to handle it.
>>
>> We can continue to use portage.os for the os.walk call, and turn
>> get_filenames into a generator method like this:
>>
>> 	for filename in filenames:
>> 		try:
>> 			yield portage._unicode_decode(filename, errors='strict')
>> 		except UnicodeDecodeError:
>> 			# Ignore it. Distfiles names must have valid UTF8 encoding.
>> 			pass
>
> Since you've already written it, could you commit it? I don't wish to
> have my name on the implicit module overrides hackery I don't approve
> of.

Done:

https://gitweb.gentoo.org/proj/portage.git/commit/?id=d9855418352398013ae787bb73f70e935ec109ca

I don't really like the portage.os unicode wrapper either, but I'm not
aware of a good alternative to solve the pervasive UnicodeEncodeError
issue that I've mentioned.
--
Thanks,
Zac
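
For readers without a Portage checkout, the generator approach discussed above can be sketched with the standard library alone. This is a rough illustration under stated assumptions, not the code from the linked commit: bytes.decode() stands in for portage._unicode_decode, the path is encoded explicitly as UTF-8 instead of going through the portage.os wrapper, and decode_valid_names and this _raise_exc are hypothetical helpers written for the example.

```python
import os


def _raise_exc(exc):
    # Hypothetical stand-in for the onerror callback named in the patch.
    raise exc


def decode_valid_names(raw_names):
    """Yield each bytes name decoded as UTF-8, skipping undecodable ones."""
    for raw_name in raw_names:
        try:
            yield raw_name.decode("utf-8", errors="strict")
        except UnicodeDecodeError:
            # Ignore it: distfile names are expected to be valid UTF-8.
            continue


def get_filenames(distdir):
    """Yield the valid UTF-8 file names found directly inside distdir."""
    # Assumption of this sketch: encode the path as UTF-8 ourselves so the
    # result does not depend on sys.getfilesystemencoding().
    if isinstance(distdir, str):
        distdir = distdir.encode("utf-8")
    # With a bytes path, os.walk() returns bytes names.
    for _dirpath, _dirnames, filenames in os.walk(distdir, onerror=_raise_exc):
        yield from decode_valid_names(filenames)
        return  # flat layout: only the top-level directory is listed
```

Because the names stay as bytes until the final decode, a file whose name is not valid UTF-8 is skipped rather than aborting the whole listing.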

Attachments

File name MIME type
signature.asc application/pgp-signature