Gentoo Archives: gentoo-portage-dev

From: Sebastian Luther <SebastianLuther@×××.de>
To: gentoo-portage-dev@l.g.o
Subject: Re: [gentoo-portage-dev] [PATCH 1/3] Have repoman check if the packages to unpack rare archive formats from SRC_URI are present in DEPEND (bug #205909).
Date: Fri, 17 Jan 2014 08:36:03
Message-Id: 52D8EB6E.9070504@gmx.de
In Reply to: Re: [gentoo-portage-dev] [PATCH 1/3] Have repoman check if the packages to unpack rare archive formats from SRC_URI are present in DEPEND (bug #205909). by Tom Wijsman
On 16.01.2014 22:40, Tom Wijsman wrote:
> On Thu, 16 Jan 2014 08:03:03 +0100 Sebastian Luther
> <SebastianLuther@×××.de> wrote:
>
>> On 16.01.2014 01:07, Tom Wijsman wrote:
>>> --- bin/repoman | 53
>>> +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> man/repoman.1 | 4 ++++ 2 files changed, 57 insertions(+)
>>>
>>> diff --git a/bin/repoman b/bin/repoman index d1542e9..9b703dc
>>> 100755 --- a/bin/repoman +++ b/bin/repoman @@ -36,6 +36,9 @@
>>> pym_path =
>>> osp.join(osp.dirname(osp.dirname(osp.realpath(__file__))),
>>> "pym") sys.path.insert(0, pym_path) import portage
>>> portage._internal_caller = True + +from portage._sets.profiles
>>> import PackagesSystemSet +system_set_atoms =
>>> PackagesSystemSet(portage.settings.profiles).getAtoms()
>>> portage._disable_legacy_globals()
>>
>> You should be using repoman_settings instead of
>> portage.settings.
>
> If I understand correctly, that is this URL?
>
> http://dev.gentoo.org/~zmedico/portage/doc/api/portage.repository.config-module.html
>
> How do I get the @system set out of that?

Both portage.settings and repoman_settings are instances of
portage.package.ebuild.config.config. Having looked at the code, I
now think you should keep using PackagesSystemSet to get the
@system atoms. Just turn them into a set afterwards with set(atom.cp
for atom in system_set_atoms).

The reason to use atom.cp is that an atom such as >=cat/foo-1 could be
part of that set, in which case '"cat/foo" in system_set_atoms' would
evaluate to False.
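To illustrate the atom.cp point without importing portage, here is a
minimal sketch. FakeAtom is a hypothetical stand-in for portage's atom
objects that models only a text form and a cp attribute; the package
names are made up.

```python
from collections import namedtuple

# Hypothetical stand-in for portage's atom objects; only the two
# attributes used below are modelled. In repoman the real atoms would
# come from PackagesSystemSet(...).getAtoms().
FakeAtom = namedtuple("FakeAtom", ["text", "cp"])

system_set_atoms = [
    FakeAtom(">=cat/foo-1", "cat/foo"),
    FakeAtom("cat/bar", "cat/bar"),
]

# A set built from the full atom text misses versioned atoms.
raw_atoms = set(atom.text for atom in system_set_atoms)

# A set built from category/package names does not.
system_cps = set(atom.cp for atom in system_set_atoms)

print("cat/foo" in raw_atoms)   # membership test against full atoms
print("cat/foo" in system_cps)  # membership test against cp strings
```

The second lookup is the one the check needs, because the entries in
archive_formats are plain category/package strings.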

>
>> Considering the later use
>
> Which use?

The "if entry not in system_set_atoms" line. The 'in' operator there
calls __contains__, and you don't use any of the additional magic
provided by PackageSet (which is a superclass of PackagesSystemSet).
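A small sketch of what "using __contains__" means here. LoudSet is a
hypothetical class written only for this illustration; it records each
membership test so the dispatch is visible, and otherwise behaves like
a plain set.

```python
class LoudSet(object):
    """Hypothetical wrapper that records every membership test."""

    def __init__(self, items):
        self._items = set(items)
        self.lookups = []

    def __contains__(self, item):
        # Both 'in' and 'not in' end up here.
        self.lookups.append(item)
        return item in self._items


system_cps = LoudSet(["app-arch/tar", "app-arch/gzip"])

missing = "app-arch/unrar" not in system_cps
present = "app-arch/tar" in system_cps

print(system_cps.lookups)
```

Since membership is the only operation the check performs, a built-in
set of cp strings provides everything that is needed.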

>
>> you don't need PackagesSystemSet set here, just use a set.
>
> Okay, thus I need to create some kind of set object here (I don't
> see one in the list of
> http://dev.gentoo.org/~zmedico/portage/doc/api/ though) and then
> specify that it would be the @system set? Which class?
>

Whenever I say 'set' I mean Python's built-in set.

>> And use atom.cp instead of the atoms.
>
> So, if I understood correctly; using list comprehension, I
> directly transform the getAtoms() to a list of atom.cp's... Okay,
> good idea.
>
>>> try: @@ -300,6 +303,7 @@ qahelp = { "inherit.missing": "Ebuild
>>> uses functions from an eclass but does not inherit it",
>>> "inherit.unused": "Ebuild inherits an eclass but does not use
>>> it", "java.eclassesnotused": "With virtual/jdk in DEPEND you
>>> must inherit a java eclass", + "unpack.DEPEND.missing": "A rare
>>> archive format was used in SRC_URI, but its package to unpack
>>> it is missing in DEPEND.", "wxwidgets.eclassnotused": "Ebuild
>>> DEPENDs on x11-libs/wxGTK without inheriting wxwidgets.eclass",
>>> "KEYWORDS.dropped": "Ebuilds that appear to have dropped
>>> KEYWORDS for some arch", "KEYWORDS.missing": "Ebuilds that have
>>> a missing or empty KEYWORDS variable", @@ -399,6 +403,7 @@
>>> qawarnings = set(( "metadata.warning", "portage.internal",
>>> "repo.eapi.deprecated", +"unpack.DEPEND.missing",
>>> "usage.obsolete", "upstream.workaround", "LIVEVCS.stable", @@
>>> -479,6 +484,25 @@ ruby_deprecated = frozenset([
>>> "ruby_targets_ree18", ])
>>>
>>> +# TODO: Add functionality to support checking for deb2targz
>>> on platforms where +# GNU binutils is absent; see PMS 5,
>>> section 11.3.3.13. +archive_formats = { +
>>> "\.7[zZ]":"app-arch/p7zip", +
>>> "\.(bz2?|tbz2)":"app-arch/bzip2", + "\.jar":"app-arch/unzip", +
>>> "\.(LH[aA]|lha|lzh)":"app-arch/lha", +
>>> "\.lzma":"app-arch/lzma-utils", +
>>> "\.(rar|RAR)":"app-arch/unrar", +
>>> "\.(tar(\.(bz2?|gz|Z))?|tbz2|t[bg]z)?":"app-arch/tar", +
>>> "\.(gz|tar\.Z|t[bg]z|[zZ])":"app-arch/gzip", +
>>> "\.(zip|ZIP)":"app-arch/unzip", +} +
>>> +archive_formats_eapi_3_to_5 = { + "\.tar.xz":"app-arch/tar", +
>>> "\.xz":"app-arch/xz-utils", +} + metadata_xml_encoding =
>>> 'UTF-8' metadata_xml_declaration = '<?xml version="1.0"
>>> encoding="%s"?>' % \ (metadata_xml_encoding,) @@ -1559,6
>>> +1583,7 @@ for x in effective_scanlist: fetchlist_dict =
>>> portage.FetchlistDict(checkdir, repoman_settings, portdb)
>>> myfiles_all = [] src_uri_error = False + needed_unpack_depends
>>> = {} for mykey in fetchlist_dict: try:
>>> myfiles_all.extend(fetchlist_dict[mykey]) @@ -1573,7 +1598,22
>>> @@ for x in effective_scanlist: stats["SRC_URI.syntax"] += 1
>>> fails["SRC_URI.syntax"].append( "%s.ebuild SRC_URI: %s" %
>>> (mykey, e)) + + # Compare each SRC_URI entry against
>>> archive_formats; if one of the + # extensions match, we
>>> remember which archive depends are needed to + # check them
>>> later on. + needed_unpack_depends[mykey] = [] + for
>>> file_extension in archive_formats or \ + ((re.match('[345]$',
>>> eapi) is not None) \
>>
>> Use portage.eapi for the line above.
>
> Why? 'eapi' is the EAPI of the ebuild, what is wrong with that?

What I want you to do is change "(re.match('[345]$', eapi) is not
None)" into something like "eapi_has_xz_unpack(eapi)". The function
eapi_has_xz_unpack needs to be written. It should be part of portage.eapi.
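A minimal sketch of such a helper. The function does not exist yet, so
both its name and the EAPI list are assumptions; the list simply
mirrors the '[345]$' regex from the patch.

```python
def eapi_has_xz_unpack(eapi):
    """Return True if unpack handles .xz and .tar.xz in this EAPI.

    Assumption: the supported EAPIs are exactly those matched by the
    patch's '[345]$' regex. This is not an existing portage function.
    """
    return eapi in ("3", "4", "5")


for candidate in ("2", "4", "4-python"):
    print(candidate, eapi_has_xz_unpack(candidate))
```

Keeping the EAPI knowledge in one named function means repoman's
scanning loop no longer encodes it as a regex, and the list needs to
be extended in only one place when a new EAPI is added.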

>
>> You may have to add a new function to portage.eapi.
>
> What would the purpose of that function be?
>
>>> + and file_extension in archive_formats_eapi_3_to_5): +
>>> for entry in fetchlist_dict[mykey]: + if re.match('.*%s$' %
>>> file_extension, entry) is not None: + format =
>>> archive_formats[file_extension]
>>
>> As these regex are used frequently, they should be compiled
>> using re.compile.
>
> I know, but it contains %s; but, I'll look if I can make a list of
> regex, one for each file extension. Or rather, I'll first try to
> instead match the last characters of the string using a substring
> without having to create a regex at all, which should be even
> faster.
>
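One way to precompile despite the %s is to build one compiled regex
per table entry up front. The sketch below uses a three-entry subset
of archive_formats; needed_unpackers and compiled_formats are
hypothetical names, not part of the patch.

```python
import re

# Subset of the patch's table, for illustration only.
archive_formats = {
    r"\.7[zZ]": "app-arch/p7zip",
    r"\.(rar|RAR)": "app-arch/unrar",
    r"\.(zip|ZIP)": "app-arch/unzip",
}

# Compile each pattern once, anchored at the end of the file name.
compiled_formats = [
    (re.compile(pattern + "$"), package)
    for pattern, package in archive_formats.items()
]


def needed_unpackers(filenames):
    """Return the set of packages needed to unpack the given files."""
    needed = set()
    for name in filenames:
        for regex, package in compiled_formats:
            if regex.search(name):
                needed.add(package)
    return needed


print(sorted(needed_unpackers(["foo-1.0.zip", "bar.7Z", "baz.tar.gz"])))
```

The regex-free variant would also work: str.endswith accepts a tuple
of suffixes, at the cost of listing every case variant explicitly.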
>>> + if format not in needed_unpack_depends[mykey]: +
>>> needed_unpack_depends[mykey].append(format)
>>
>> I'd make needed_unpack_depends[mykey] a set. Then you can just
>> add() instead of checking and appending.
>
> Thanks for the suggestion, I'll look into this.
>
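A short sketch of the set-based bookkeeping. Using defaultdict is an
extra assumption on top of the suggestion (a plain dict of sets works
too), and the key and package list are made up.

```python
from collections import defaultdict

# Each key maps to a set, so duplicates are dropped automatically.
needed_unpack_depends = defaultdict(set)

mykey = "cat/pkg-1.0"  # hypothetical ebuild key
for package in ["app-arch/unzip", "app-arch/unrar", "app-arch/unzip"]:
    # No "if package not in ..." check is needed before adding.
    needed_unpack_depends[mykey].add(package)

print(sorted(needed_unpack_depends[mykey]))
```

With defaultdict, looking up a key that was never filled yields an
empty set, so the later DEPEND loop would iterate over nothing rather
than raise KeyError.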
>>> del fetchlist_dict + if not src_uri_error: # This test can
>>> produce false positives if SRC_URI could not # be parsed for
>>> one or more ebuilds. There's no point in @@ -2010,6 +2050,17 @@
>>> for x in effective_scanlist: atoms = None
>>> badsyntax.append(str(e))
>>>
>>> + if atoms and mytype == 'DEPEND':
>>
>> Use "if atoms and buildtime:" here.
>
> +1
>
>>> + # We check whether the needed archive dependencies are
>>> present + # in DEPEND, which were determined from SRC_URI. +
>>> for entry in needed_unpack_depends[catdir + '/' + y]:
>>
>> Use the existing catpkg here.
>
> Missed that, thank you.
>
>>> + if entry not in system_set_atoms and entry \ + not
>>> in [atom.cp for atom in atoms if atom != "||"]: +
>>> stats['unpack.' + mytype + '.missing'] += 1 +
>>> fails['unpack.' + mytype + '.missing'].append( \ +
>>> relative_path + ": %s is missing in %s" % \ + (entry,
>>> mytype)) + if atoms and mytype.endswith("DEPEND"): if runtime
>>> and \ "test?" in mydepstr.split(): @@ -2384,6 +2435,8 @@ for x
>>> in effective_scanlist: "%s/metadata.xml: unused local
>>> USE-description: '%s'" % \ (x, myflag))
>>>
>>> + del needed_unpack_depends + if options.if_modified == "y" and
>>> len(effective_scanlist) < 1: logging.warn("--if-modified is
>>> enabled, but no modified packages were found!") diff --git
>>> a/man/repoman.1 b/man/repoman.1 index a78f94e..e739d56 100644
>>> --- a/man/repoman.1 +++ b/man/repoman.1 @@ -334,6 +334,10 @@
>>> Ebuild inherits a deprecated eclass With virtual/jdk in DEPEND
>>> you must inherit a java eclass. Refer to
>>> \fIhttp://www.gentoo.org/proj/en/java/java\-devel.xml\fR for
>>> more information. .TP +.B unpack.DEPEND.missing +A rare archive
>>> format was used in SRC_URI, but its package to unpack it is
>> ^^^ the(?)
>
> Unsure myself as well, but yes; the is the safe option here.
>
>>> +missing in DEPEND.
>> ^^ from(?)
>
> Yes, 'in action' or 'from something'; thus 'from'. Thanks.
>
>>> +TP .B manifest.bad Manifest has missing or incorrect digests
>>> .TP
>>>
>>
>> Maybe you could remove the entries from the archive_formats
>> variable once you know if they are in the system set.
>
> The purpose here is to allow to support changes in the system set;
> when something is added or present in the system set, it doesn't
> necessarily imply that it will stay. Keeping them listed foresees
> that a format could become deprecated or less used in the future.
>
I didn't mean removing entries from the hardcoded list. I meant
filtering them out at runtime, inside the code, once you know the
contents of @system.
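
A sketch of that runtime filtering. The table is a subset of the
patch's archive_formats, the contents of system_cps are made up for
the example, and formats_to_check is a hypothetical name.

```python
# Hardcoded table stays complete (subset shown for illustration).
archive_formats = {
    r"\.(rar|RAR)": "app-arch/unrar",
    r"\.(zip|ZIP)": "app-arch/unzip",
    r"\.(gz|tar\.Z|t[bg]z|[zZ])": "app-arch/gzip",
}

# Hypothetical @system contents, reduced to category/package names.
system_cps = set(["app-arch/gzip", "app-arch/tar"])

# Built once at startup: only formats whose unpacker is not already
# guaranteed by @system need to be checked per ebuild.
formats_to_check = dict(
    (pattern, package)
    for pattern, package in archive_formats.items()
    if package not in system_cps
)

print(sorted(formats_to_check.values()))
```

The per-ebuild loop then iterates over formats_to_check only, and the
"entry not in system_set_atoms" test in the DEPEND check is no longer
needed.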
