Gentoo Archives: gentoo-commits

From: Michael Orlitzky <mjo@g.o>
To: gentoo-commits@l.g.o
Subject: [gentoo-commits] repo/gentoo:master commit in: dev-python/pyzor/files/
Date: Sat, 24 Mar 2018 16:13:56
Message-Id: 1521907943.300f46c1c52cc79238e88dba28241a6e78525966.mjo@gentoo
commit: 300f46c1c52cc79238e88dba28241a6e78525966
Author: Michael Orlitzky <mjo <AT> gentoo <DOT> org>
AuthorDate: Sat Mar 24 16:12:23 2018 +0000
Commit: Michael Orlitzky <mjo <AT> gentoo <DOT> org>
CommitDate: Sat Mar 24 16:12:23 2018 +0000
URL: https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=300f46c1

dev-python/pyzor: fix the binary stdin patch to work with v1.0.0.

In my previous commit (adding -r1), I applied a new patch that I've
submitted upstream to address a unicode crash with python-3.x. That
patch applies cleanly against v1.0.0, but won't actually work: the
get_binary_stdin() function it uses exists only in upstream's git
master branch.

To make the patch work (and to fix some other small issues), I've
included the rest of the client changes between v1.0.0 and git
master. There are very few of them -- all python-3.x fixes -- so this
should not be too objectionable.

Bug: https://bugs.gentoo.org/643692
Package-Manager: Portage-2.3.24, Repoman-2.3.6

 .../read-stdin-as-binary-in-get_input_msg.patch | 94 +++++++++++++++-------
 1 file changed, 67 insertions(+), 27 deletions(-)

diff --git a/dev-python/pyzor/files/read-stdin-as-binary-in-get_input_msg.patch b/dev-python/pyzor/files/read-stdin-as-binary-in-get_input_msg.patch
index 81668e36937..03031a97669 100644
--- a/dev-python/pyzor/files/read-stdin-as-binary-in-get_input_msg.patch
+++ b/dev-python/pyzor/files/read-stdin-as-binary-in-get_input_msg.patch
@@ -1,45 +1,85 @@
-From 6332a429ed415187599ecce7d8a169ee19f0bbe5 Mon Sep 17 00:00:00 2001
+From 66225b32d2774cf37fa7f702f7eb26cd94094482 Mon Sep 17 00:00:00 2001
 From: Michael Orlitzky <michael@××××××××.com>
-Date: Sun, 4 Mar 2018 17:34:33 -0500
-Subject: [PATCH 1/1] scripts/pyzor: read stdin as binary in _get_input_msg().
+Date: Sun, 4 Mar 2018 17:27:01 -0500
+Subject: [PATCH 1/1] scripts/pyzor: replace the client with the git (+ issue
+ 64 fix) version.

-Reading stdin in python-3.x is done as text, with a best-guess
-encoding. But this can go awry: for example, if an iso-8859-1 message
-is passed in and if python guesses the "utf-8" encoding, then read()
-will fail with a UnicodeDecodeError on non-ASCII characters. For
-example, the "copyright" symbol is a single byte 0xa9 in iso-8859-1,
-and the utf-8 decoder can't handle it:
-
- UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa9... invalid
- start byte
-
-Instead -- and as was done in python-2.x -- we can read stdin as
-binary using the new get_binary_stdin() function. Afterwards, we use
-email.message_from_bytes() instead of the email.message_from_file()
-constructor to parse the byte data. The resulting function is able to
-correctly parse these messages.
-
-Closes: https://github.com/SpamExperts/pyzor/issues/64
 ---
- scripts/pyzor | 5 ++++-
- 1 file changed, 4 insertions(+), 1 deletion(-)
+ scripts/pyzor | 33 +++++++++++++++++++++++++++------
+ 1 file changed, 27 insertions(+), 6 deletions(-)

 diff --git a/scripts/pyzor b/scripts/pyzor
-index 567a7f9..1ba632f 100755
+index 19b1d21..86c6f7d 100755
 --- a/scripts/pyzor
 +++ b/scripts/pyzor
-@@ -171,7 +171,10 @@ def _get_input_digests(dummy):
+@@ -17,9 +17,9 @@ import tempfile
+ import threading
+
+ try:
+-    import ConfigParser
+-except ImportError:
+     import configparser as ConfigParser
++except ImportError:
++    import ConfigParser
+
+ import pyzor.digest
+ import pyzor.client
+@@ -110,7 +110,7 @@ def load_configuration():
+     config = ConfigParser.ConfigParser()
+     # Set the defaults.
+     config.add_section("client")
+-    for key, value in defaults.iteritems():
++    for key, value in defaults.items():
+         config.set("client", key, value)
+     # Override with the configuration.
+     config.read(os.path.join(options.homedir, "config"))
+@@ -171,14 +171,35 @@ def _get_input_digests(dummy):


  def _get_input_msg(digester):
 -    msg = email.message_from_file(sys.stdin)
-+    # Read and process stdin as bytes because we don't know its
-+    # encoding. Python-3.x will try to guess -- and can sometimes
-+    # guess wrong -- leading to decoding errors in read().
 +    msg = email.message_from_bytes(get_binary_stdin().read())
      digested = digester(msg).value
      yield digested

+
++def _is_binary_reader(stream, default=False):
++    try:
++        return isinstance(stream.read(0), bytes)
++    except Exception:
++        return default
++
++
++def get_binary_stdin():
++    # sys.stdin might or might not be binary in some extra cases. By
++    # default it's obviously non binary which is the core of the
++    # problem but the docs recommend changing it to binary for such
++    # cases so we need to deal with it.
++    is_binary = _is_binary_reader(sys.stdin, False)
++    if is_binary:
++        return sys.stdin
++    buf = getattr(sys.stdin, 'buffer', None)
++    if buf is not None and _is_binary_reader(buf, True):
++        return buf
++    raise RuntimeError('Did not manage to get binary stdin')
++
++
+ def _get_input_mbox(digester):
+     tfile = tempfile.NamedTemporaryFile()
+-    tfile.write(sys.stdin.read().encode("utf8"))
++    tfile.write(get_binary_stdin().read())
+     tfile.seek(0)
+     mbox = mailbox.mbox(tfile.name)
+     for msg in mbox:
+@@ -372,7 +393,7 @@ def genkey(client, servers, config, hash_func=hashlib.sha1):
+         return False
+     # pylint: disable-msg=W0612
+     salt = "".join([chr(random.randint(0, 255))
+-                    for unused in xrange(hash_func(b"").digest_size)])
++                    for unused in range(hash_func(b"").digest_size)])
+     if sys.version_info >= (3, 0):
+         salt = salt.encode("utf8")
+     salt_digest = hash_func(salt)
 --
 2.13.6
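
For readers following along, the failure described in the original patch
message can be reproduced without pyzor. The following is only a minimal
sketch, not part of the commit: the sample message RAW and the helper
names decode_as_utf8() and parse_as_bytes() are hypothetical, and an
in-memory stream stands in for sys.stdin. It assumes python-3.x guessed
"utf-8" for a message that is really iso-8859-1.

```python
import email
import io

# Hypothetical sample message (not from the commit): the body starts
# with the iso-8859-1 copyright symbol, which is the single byte 0xa9.
RAW = b"Subject: test\n\n\xa9 2018 Example\n"


def decode_as_utf8(data):
    """Mimic a text-mode stdin for which python-3.x guessed utf-8."""
    return io.TextIOWrapper(io.BytesIO(data), encoding="utf-8").read()


def parse_as_bytes(data):
    """Parse the raw bytes directly, without guessing an encoding."""
    return email.message_from_bytes(data)


# Text-mode reading fails on the 0xa9 byte...
try:
    decode_as_utf8(RAW)
    text_mode_failed = False
except UnicodeDecodeError:
    text_mode_failed = True

# ...while the bytes-based constructor parses the same data.
msg = parse_as_bytes(RAW)
print(text_mode_failed, msg["Subject"])
```

In the patched client the bytes come from get_binary_stdin().read()
rather than from a literal, but the parsing step is the same
email.message_from_bytes() call.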