Gentoo Archives: gentoo-portage-dev

From: Brian Dolbec <dolsen@g.o>
To: gentoo-portage-dev@l.g.o
Subject: Re: [gentoo-portage-dev] [PATCH] repoman: use regular expression to detect line continuations
Date: Wed, 22 Feb 2017 03:29:31
Message-Id: 20170221192926.1b324cff.dolsen@gentoo.org
In Reply to: [gentoo-portage-dev] [PATCH] repoman: use regular expression to detect line continuations by Zac Medico
On Tue, 21 Feb 2017 16:31:56 -0800
Zac Medico <zmedico@g.o> wrote:

> Use a regular expression to detect line continuations, instead
> of the unicode_escape codec, since the unicode_escape codec is
> not really intended to be used this way.
>
> This solves an issue with python3.6, where a DeprecationWarning
> is triggered by ebuilds containing escape sequences, like this
> warning triggered by a sed expression in the dev-db/sqlite
> ebuilds:
>
> DeprecationWarning: invalid escape sequence '\['
> ---
>  repoman/pym/repoman/modules/scan/ebuild/checks.py | 28 +++++++----------------
>  1 file changed, 8 insertions(+), 20 deletions(-)
>
> diff --git a/repoman/pym/repoman/modules/scan/ebuild/checks.py b/repoman/pym/repoman/modules/scan/ebuild/checks.py
> index 15e2251..d21bf0c 100644
> --- a/repoman/pym/repoman/modules/scan/ebuild/checks.py
> +++ b/repoman/pym/repoman/modules/scan/ebuild/checks.py
> @@ -8,8 +8,8 @@
>  and correctness of an ebuild."""
>  from __future__ import unicode_literals
>
> -import codecs
>  from itertools import chain
> +import operator
>  import re
>  import time
>
> @@ -923,11 +923,10 @@ def checks_init(experimental_inherit=False):
>
>  _here_doc_re = re.compile(r'.*<<[-]?(\w+)\s*(>\s*\S+\s*)?$')
>  _ignore_comment_re = re.compile(r'^\s*#')
> +_continuation_re = re.compile(r'(\\)*$')
>
>
>  def run_checks(contents, pkg):
> -    unicode_escape_codec = codecs.lookup('unicode_escape')
> -    unicode_escape = lambda x: unicode_escape_codec.decode(x)[0]
>      if _constant_checks is None:
>          checks_init()
>      checks = _constant_checks
> @@ -957,32 +956,21 @@ def run_checks(contents, pkg):
>          # cow
>          # This will merge these lines like so:
>          # inherit foo bar moo cow
> -        try:
> -            # A normal line will end in the two bytes: <\> <\n>. So decoding
> -            # that will result in python thinking the <\n> is being escaped
> -            # and eat the single <\> which makes it hard for us to detect.
> -            # Instead, strip the newline (which we know all lines have), and
> -            # append a <0>. Then when python escapes it, if the line ended
> -            # in a <\>, we'll end up with a <\0> marker to key off of. This
> -            # shouldn't be a problem with any valid ebuild ...
> -            line_escaped = unicode_escape(line.rstrip('\n') + '0')
> -        except SystemExit:
> -            raise
> -        except:
> -            # Who knows what kind of crazy crap an ebuild will have
> -            # in it -- don't allow it to kill us.
> -            line_escaped = line
> +        # A line ending with an even number of backslashes does not count,
> +        # because the last backslash is escaped. Therefore, search for an
> +        # odd number of backslashes.
> +        line_escaped = operator.sub(*_continuation_re.search(line).span()) % 2 == 1
>          if multiline:
>              # Chop off the \ and \n bytes from the previous line.
>              multiline = multiline[:-2] + line
> -            if not line_escaped.endswith('\0'):
> +            if not line_escaped:
>                  line = multiline
>                  num = multinum
>                  multiline = None
>              else:
>                  continue
>          else:
> -            if line_escaped.endswith('\0'):
> +            if line_escaped:
>                  multinum = num
>                  multiline = line
>                  continue
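
For reference, the warning described in the commit message can be reproduced outside repoman. This is only a minimal sketch, assuming Python 3.6 or later; the helper name decode_with_warnings and the sample sed fragment are made up for illustration and are not taken from repoman or from the sqlite ebuilds.

```python
import codecs
import warnings


def decode_with_warnings(raw):
    """Decode bytes with unicode_escape and collect any warnings raised.

    Hypothetical helper for illustration; not part of repoman.
    """
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones the default filters would hide.
        warnings.simplefilter("always")
        text = codecs.decode(raw, "unicode_escape")
    return text, [str(w.message) for w in caught]


# Assumed sample input: a sed-style fragment containing a backslash followed
# by '[', which unicode_escape does not recognise as an escape sequence.
text, messages = decode_with_warnings(b"s:\\[a-z]::")
for message in messages:
    print(message)
```

The decoded text still contains the backslash, so the only visible effect of the unrecognised sequence is the warning itself.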

Code seems fine to me, I trust you ;)
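
For anyone who wants to try the odd/even backslash logic on its own, here is a rough standalone sketch. The regular expression is the one from the patch; the function names ends_with_continuation and join_continuations are hypothetical, and the final flush of an unterminated continuation at end of input is my own assumption rather than something the patch does. It computes end - start instead of operator.sub(*span), which gives the same parity.

```python
import re

# Same pattern as in the quoted patch: a run of backslashes at end of line.
_continuation_re = re.compile(r'(\\)*$')


def ends_with_continuation(line):
    """Return True if the line ends in an odd number of backslashes."""
    start, end = _continuation_re.search(line).span()
    return (end - start) % 2 == 1


def join_continuations(lines):
    """Yield (first_line_number, merged_line) pairs.

    Hypothetical helper; assumes every continued line ends in a
    backslash followed by a newline, as the patch does.
    """
    pending = None
    pending_num = 0
    for num, line in enumerate(lines):
        if pending is not None:
            # Chop off the backslash and newline from the previous line.
            pending = pending[:-2] + line
            if ends_with_continuation(line):
                continue
            yield pending_num, pending
            pending = None
        elif ends_with_continuation(line):
            pending_num = num
            pending = line
        else:
            yield num, line
    if pending is not None:
        # Assumption: emit an unterminated continuation instead of dropping it.
        yield pending_num, pending


sample = ["inherit foo bar \\\n", "moo \\\n", "cow\n", "x\n"]
merged = list(join_continuations(sample))
```

With the sample above, the first three lines come out as one merged line numbered from the line where the continuation started, which is what the checks need for reporting.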

--
Brian Dolbec <dolsen>
