Gentoo Archives: gentoo-portage-dev

From: "Michał Górny" <mgorny@g.o>
To: Zac Medico <zmedico@g.o>
Cc: gentoo-portage-dev@l.g.o
Subject: Re: [gentoo-portage-dev] [PATCH] Generate soname dependency metadata (282639)
Date: Fri, 30 Jan 2015 05:01:13
Message-Id: 20150130060049.5535277f@pomiot.lan
In Reply to: [gentoo-portage-dev] [PATCH] Generate soname dependency metadata (282639) by Zac Medico
1 On 2015-01-26, at 19:16:28,
2 Zac Medico <zmedico@g.o> wrote:
3
4 > Generate soname dependency metadata for binary and installed packages,
5 > in the form of PROVIDES and REQUIRES metadata. It is useful to generate
6 > PROVIDES and REQUIRES metadata now, so that it will be available
7 > when dependency resolver support is added in the future. Note that
8 > slot-operator dependencies will not be able to serve as a substitute
9 > for soname dependencies for the foreseeable future, because system
10 > dependencies are frequently unspecified (according to Gentoo policy).
11 >
12 > The PROVIDES/REQUIRES system is very similar to the automatic Requires
13 > and Provides system which is supported by RPM. The PROVIDES/REQUIRES
14 > metadata is generated automatically from the ELF files that are
15 > installed by a package. The PROVIDES/REQUIRES syntax is described in
16 > the /var/db/pkg section of the portage(5) man page. REQUIRES_EXCLUDE
17 > and PROVIDES_EXCLUDE ebuild variables allow for filtering of the
18 > sonames that are saved in REQUIRES and PROVIDES (see the ebuild(5) man
19 > page for details).
20 >
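To make the exclude semantics concrete, here is a minimal sketch of how a whitespace-delimited list of fnmatch patterns (with optional quoting) can be turned into a single filter. It assumes the standard library's shlex.split as a stand-in for portage.util.shlex_split, and the helper names are hypothetical. The patch itself also matches the patterns against file paths, which this sketch leaves out.

```python
import fnmatch
import re
import shlex

def compile_exclude(value):
    """Compile an *_EXCLUDE style value into one regex, or return None."""
    # shlex.split honours quotes, so a pattern may contain whitespace.
    patterns = shlex.split(value)
    if not patterns:
        return None
    return re.compile("|".join(fnmatch.translate(p) for p in patterns))

def filter_sonames(sonames, exclude_value):
    """Return the sonames that are not matched by any exclude pattern."""
    pat = compile_exclude(exclude_value)
    return [s for s in sonames if pat is None or pat.match(s) is None]
```

An empty value compiles to None, so nothing is filtered in that case.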
21 > The /var/db/pkg NEEDED.ELF.2 format now includes an additional field
22 > which indicates the multilib category, as discussed in bug #534206. The
23 > multilib category is used to categorize the sonames that are listed in
24 > PROVIDES/REQUIRES metadata, since sonames need to be resolved
25 > separately for each multilib category. The complete list of supported
26 > multilib categories is documented in the comments of the
27 > portage.dep.soname.multilib_category module.
28 >
29 > X-Gentoo-Bug: 282639
30 > X-Gentoo-Bug-URL: https://bugs.gentoo.org/show_bug.cgi?id=282639
31 > ---
32 > bin/ebuild.sh | 2 +-
33 > bin/phase-functions.sh | 2 +-
34 > man/ebuild.5 | 12 +++
35 > man/portage.5 | 25 +++++
36 > pym/_emerge/Package.py | 3 +-
37 > pym/portage/dbapi/bintree.py | 5 +-
38 > pym/portage/dbapi/vartree.py | 1 +
39 > pym/portage/dep/soname/__init__.py | 2 +
40 > pym/portage/dep/soname/multilib_category.py | 112 +++++++++++++++++++++++
41 > pym/portage/package/ebuild/doebuild.py | 86 ++++++++++++++++--
42 > pym/portage/util/_dyn_libs/LinkageMapELF.py | 61 ++++++++++---
43 > pym/portage/util/_dyn_libs/NeededEntry.py | 83 +++++++++++++++++
44 > pym/portage/util/_dyn_libs/soname_deps.py | 136 ++++++++++++++++++++++++++++
45 > pym/portage/util/elf/__init__.py | 2 +
46 > pym/portage/util/elf/constants.py | 36 ++++++++
47 > pym/portage/util/elf/header.py | 62 +++++++++++++
48 > pym/portage/util/endian/__init__.py | 2 +
49 > pym/portage/util/endian/decode.py | 56 ++++++++++++
50 > 18 files changed, 661 insertions(+), 27 deletions(-)
51 > create mode 100644 pym/portage/dep/soname/__init__.py
52 > create mode 100644 pym/portage/dep/soname/multilib_category.py
53 > create mode 100644 pym/portage/util/_dyn_libs/NeededEntry.py
54 > create mode 100644 pym/portage/util/_dyn_libs/soname_deps.py
55 > create mode 100644 pym/portage/util/elf/__init__.py
56 > create mode 100644 pym/portage/util/elf/constants.py
57 > create mode 100644 pym/portage/util/elf/header.py
58 > create mode 100644 pym/portage/util/endian/__init__.py
59 > create mode 100644 pym/portage/util/endian/decode.py
60 >
61 > diff --git a/bin/ebuild.sh b/bin/ebuild.sh
62 > index e6f9cb9..b6b3723 100755
63 > --- a/bin/ebuild.sh
64 > +++ b/bin/ebuild.sh
65 > @@ -578,7 +578,7 @@ if ! has "$EBUILD_PHASE" clean cleanrm ; then
66 > # interaction begins.
67 > unset EAPI DEPEND RDEPEND PDEPEND HDEPEND INHERITED IUSE REQUIRED_USE \
68 > ECLASS E_IUSE E_REQUIRED_USE E_DEPEND E_RDEPEND E_PDEPEND \
69 > - E_HDEPEND
70 > + E_HDEPEND PROVIDES_EXCLUDE REQUIRES_EXCLUDE
71 >
72 > if [[ $PORTAGE_DEBUG != 1 || ${-/x/} != $- ]] ; then
73 > source "$EBUILD" || die "error sourcing ebuild"
74 > diff --git a/bin/phase-functions.sh b/bin/phase-functions.sh
75 > index aec86fd..def2080 100644
76 > --- a/bin/phase-functions.sh
77 > +++ b/bin/phase-functions.sh
78 > @@ -580,7 +580,7 @@ __dyn_install() {
79 > for f in ASFLAGS CBUILD CC CFLAGS CHOST CTARGET CXX \
80 > CXXFLAGS EXTRA_ECONF EXTRA_EINSTALL EXTRA_MAKE \
81 > LDFLAGS LIBCFLAGS LIBCXXFLAGS QA_CONFIGURE_OPTIONS \
82 > - QA_DESKTOP_FILE ; do
83 > + QA_DESKTOP_FILE PROVIDES_EXCLUDE REQUIRES_EXCLUDE ; do
84 > x=$(echo -n ${!f})
85 > [[ -n $x ]] && echo "$x" > $f
86 > done
87 > diff --git a/man/ebuild.5 b/man/ebuild.5
88 > index b587264..c2cbe4b 100644
89 > --- a/man/ebuild.5
90 > +++ b/man/ebuild.5
91 > @@ -480,6 +480,12 @@ source source\-build which is scheduled for merge
92 > .TE
93 > .RE
94 > .TP
95 > +.B PROVIDES_EXCLUDE\fR = \fI[space delimited list of fnmatch patterns]\fR
96 > +Sonames and file paths matched by these fnmatch patterns will be
97 > +excluded during generation of \fBPROVIDES\fR metadata (see
98 > +\fBportage\fR(5)). Patterns are delimited by whitespace, and it is
99 > +possible to create patterns containing quoted whitespace.
100 > +.TP
101 > .B PORTAGE_LOG_FILE
102 > Contains the path of the build log. If \fBPORT_LOGDIR\fR variable is unset then
103 > PORTAGE_LOG_FILE=\fI"${T}/build.log"\fR.
104 > @@ -501,6 +507,12 @@ to the package version(s) being replaced. Typically, this variable will
105 > not contain more than one version, but according to PMS it can contain
106 > more.
107 > .TP
108 > +.B REQUIRES_EXCLUDE\fR = \fI[space delimited list of fnmatch patterns]\fR
109 > +Sonames and file paths matched by these fnmatch patterns will be
110 > +excluded during generation of \fBREQUIRES\fR metadata (see
111 > +\fBportage\fR(5)). Patterns are delimited by whitespace, and it is
112 > +possible to create patterns containing quoted whitespace.
113 > +.TP
114 > .B ROOT\fR = \fI"/"
115 > Contains the path that portage should use as the root of the live filesystem.
116 > When packages wish to make changes to the live filesystem, they should do so in
117 > diff --git a/man/portage.5 b/man/portage.5
118 > index 189561c..bf159fd 100644
119 > --- a/man/portage.5
120 > +++ b/man/portage.5
121 > @@ -1443,6 +1443,31 @@ can be changed quickly. Generally though there is one file per environment
122 > variable that "matters" (like CFLAGS) with the contents stored inside of it.
123 > Another common file is the CONTENTS file which lists the path and hashes of
124 > all objects that the package installed onto your system.
125 > +.TP
126 > +.BR PROVIDES
127 > +Contains information about the sonames that a package provides, which is
128 > +automatically generated from the files that it installs. The sonames
129 > +may have been filtered by the \fBPROVIDES_EXCLUDE\fR \fBebuild\fR(5)
130 > +variable. A multilib category, followed by a colon, always precedes a
131 > +list of one or more sonames.
132 > +
133 > +.I Example:
134 > +.nf
135 > +x86_32: libcom_err.so.2 libss.so.2 x86_64: libcom_err.so.2 libss.so.2
136 > +.fi
137 > +.TP
138 > +.BR REQUIRES
139 > +Contains information about the sonames that a package requires, which is
140 > +automatically generated from the files that it installs. The sonames
141 > +may have been filtered by the \fBREQUIRES_EXCLUDE\fR \fBebuild\fR(5)
142 > +variable. Any sonames that a package provides are automatically excluded
143 > +from \fBREQUIRES\fR. A multilib category, followed by a colon, always
144 > +precedes a list of one or more sonames.
145 > +
146 > +.I Example:
147 > +.nf
148 > +x86_32: ld-linux.so.2 libc.so.6 x86_64: ld-linux-x86-64.so.2 libc.so.6
149 > +.fi
150 > .RE
151 > .TP
152 > .BR /var/lib/portage/
153 > diff --git a/pym/_emerge/Package.py b/pym/_emerge/Package.py
154 > index 8612e8b..518dbf6 100644
155 > --- a/pym/_emerge/Package.py
156 > +++ b/pym/_emerge/Package.py
157 > @@ -43,7 +43,8 @@ class Package(Task):
158 > "HDEPEND", "INHERITED", "IUSE", "KEYWORDS",
159 > "LICENSE", "PDEPEND", "PROVIDE", "RDEPEND",
160 > "repository", "PROPERTIES", "RESTRICT", "SLOT", "USE",
161 > - "_mtime_", "DEFINED_PHASES", "REQUIRED_USE"]
162 > + "_mtime_", "DEFINED_PHASES", "REQUIRED_USE", "PROVIDES",
163 > + "REQUIRES"]
164 >
165 > _dep_keys = ('DEPEND', 'HDEPEND', 'PDEPEND', 'RDEPEND')
166 > _buildtime_keys = ('DEPEND', 'HDEPEND')
167 > diff --git a/pym/portage/dbapi/bintree.py b/pym/portage/dbapi/bintree.py
168 > index 1156b66..583e208 100644
169 > --- a/pym/portage/dbapi/bintree.py
170 > +++ b/pym/portage/dbapi/bintree.py
171 > @@ -81,7 +81,8 @@ class bindbapi(fakedbapi):
172 > ["BUILD_TIME", "CHOST", "DEPEND", "EAPI",
173 > "HDEPEND", "IUSE", "KEYWORDS",
174 > "LICENSE", "PDEPEND", "PROPERTIES", "PROVIDE",
175 > - "RDEPEND", "repository", "RESTRICT", "SLOT", "USE", "DEFINED_PHASES"
176 > + "RDEPEND", "repository", "RESTRICT", "SLOT", "USE",
177 > + "DEFINED_PHASES", "PROVIDES", "REQUIRES"
178 > ])
179 > self._aux_cache_slot_dict = slot_dict_class(self._aux_cache_keys)
180 > self._aux_cache = {}
181 > @@ -322,7 +323,7 @@ class binarytree(object):
182 > ["BUILD_TIME", "CHOST", "DEPEND", "DESCRIPTION", "EAPI",
183 > "HDEPEND", "IUSE", "KEYWORDS", "LICENSE", "PDEPEND", "PROPERTIES",
184 > "PROVIDE", "RESTRICT", "RDEPEND", "repository", "SLOT", "USE", "DEFINED_PHASES",
185 > - "BASE_URI"]
186 > + "BASE_URI", "PROVIDES", "REQUIRES"]
187 > self._pkgindex_aux_keys = list(self._pkgindex_aux_keys)
188 > self._pkgindex_use_evaluated_keys = \
189 > ("DEPEND", "HDEPEND", "LICENSE", "RDEPEND",
190 > diff --git a/pym/portage/dbapi/vartree.py b/pym/portage/dbapi/vartree.py
191 > index 2d4d32d..cf31c8e 100644
192 > --- a/pym/portage/dbapi/vartree.py
193 > +++ b/pym/portage/dbapi/vartree.py
194 > @@ -176,6 +176,7 @@ class vardbapi(dbapi):
195 > "EAPI", "HDEPEND", "HOMEPAGE", "IUSE", "KEYWORDS",
196 > "LICENSE", "PDEPEND", "PROPERTIES", "PROVIDE", "RDEPEND",
197 > "repository", "RESTRICT" , "SLOT", "USE", "DEFINED_PHASES",
198 > + "PROVIDES", "REQUIRES"
199 > ])
200 > self._aux_cache_obj = None
201 > self._aux_cache_filename = os.path.join(self._eroot,
202 > diff --git a/pym/portage/dep/soname/__init__.py b/pym/portage/dep/soname/__init__.py
203 > new file mode 100644
204 > index 0000000..4725d33
205 > --- /dev/null
206 > +++ b/pym/portage/dep/soname/__init__.py
207 > @@ -0,0 +1,2 @@
208 > +# Copyright 2015 Gentoo Foundation
209 > +# Distributed under the terms of the GNU General Public License v2
210 > diff --git a/pym/portage/dep/soname/multilib_category.py b/pym/portage/dep/soname/multilib_category.py
211 > new file mode 100644
212 > index 0000000..8cc8fd3
213 > --- /dev/null
214 > +++ b/pym/portage/dep/soname/multilib_category.py
215 > @@ -0,0 +1,112 @@
216 > +# Copyright 2015 Gentoo Foundation
217 > +# Distributed under the terms of the GNU General Public License v2
218 > +#
219 > +# Compute a multilib category, as discussed here:
220 > +#
221 > +# https://bugs.gentoo.org/show_bug.cgi?id=534206
222 > +#
223 > +# Supported categories:
224 > +#
225 > +# alpha_{32,64}
226 > +# arm_{32,64}
227 > +# hppa_{32,64}
228 > +# ia_{32,64}
229 > +# m68k_{32,64}
230 > +# mips_{eabi32,eabi64,n32,n64,o32,o64}
231 > +# ppc_{32,64}
232 > +# s390_{32,64}
233 > +# sh_{32,64}
234 > +# sparc_{32,64}
235 > +# x86_{32,64,x32}
236 > +#
237 > +# NOTES:
238 > +#
239 > +# * The ABIs referenced by some of the above *_32 and *_64 categories
240 > +# may be imaginary, but they are listed anyway, since the goal is to
241 > +# establish a naming convention that is as consistent and uniform as
242 > +# possible.
243 > +#
244 > +# * The Elf header's e_ident[EI_OSABI] byte is completely ignored,
245 > +# since OS-independence is one of the goals. The assumption is that,
246 > +# for a given installation, we are only interested in tracking multilib
247 > +# ABIs for a single OS.
248 > +
249 > +from ...util.elf.constants import (
250
251 Please do not use relative imports. Almost all of the code uses absolute
252 imports, so if we're going to change that, we should have a proper
253 discussion first.
254
255 > + EF_MIPS_ABI, EF_MIPS_ABI2, ELFCLASS32, ELFCLASS64,
256 > + EM_386, EM_68K, EM_AARCH64, EM_ALPHA, EM_ARM, EM_IA_64, EM_MIPS,
257 > + EM_PARISC, EM_PPC, EM_PPC64, EM_S390, EM_SH, EM_SPARC,
258 > + EM_SPARC32PLUS, EM_SPARCV9, EM_X86_64, E_MIPS_ABI_EABI32,
259 > + E_MIPS_ABI_EABI64, E_MIPS_ABI_O32, E_MIPS_ABI_O64)
260 > +
261 > +_machine_prefix_map = {
262 > + EM_386: "x86",
263 > + EM_68K: "m68k",
264 > + EM_AARCH64: "arm",
265 > + EM_ALPHA: "alpha",
266 > + EM_ARM: "arm",
267 > + EM_IA_64: "ia",
268 > + EM_MIPS: "mips",
269 > + EM_PARISC: "hppa",
270 > + EM_PPC: "ppc",
271 > + EM_PPC64: "ppc",
272 > + EM_S390: "s390",
273 > + EM_SH: "sh",
274 > + EM_SPARC: "sparc",
275 > + EM_SPARC32PLUS: "sparc",
276 > + EM_SPARCV9: "sparc",
277 > + EM_X86_64: "x86",
278 > +}
279 > +
280 > +_mips_abi_map = {
281 > + E_MIPS_ABI_EABI32: "eabi32",
282 > + E_MIPS_ABI_EABI64: "eabi64",
283 > + E_MIPS_ABI_O32: "o32",
284 > + E_MIPS_ABI_O64: "o64",
285 > +}
286 > +
287 > +def _compute_suffix_mips(elf_header):
288 > +
289 > + name = None
290 > + mips_abi = elf_header.e_flags & EF_MIPS_ABI
291 > +
292 > + if mips_abi:
293 > + name = _mips_abi_map.get(mips_abi)
294 > + elif elf_header.e_flags & EF_MIPS_ABI2:
295 > + name = "n32"
296 > + elif elf_header.ei_class == ELFCLASS64:
297 > + name = "n64"
298 > +
299 > + return name
300 > +
301 > +def compute_multilib_category(elf_header):
302 > + """
303 > + Compute a multilib category from an ELF header.
304 > +
305 > + @param elf_header: an ELFHeader instance
306 > + @type elf_header: ELFHeader
307 > + @rtype: str
308 > + @return: A multilib category, or None if elf_header does not fit
309 > + into a recognized category
310 > + """
311 > + category = None
312 > + if elf_header.e_machine is not None:
313 > +
314 > + prefix = _machine_prefix_map.get(elf_header.e_machine)
315 > + suffix = None
316 > +
317 > + if prefix == "mips":
318 > + suffix = _compute_suffix_mips(elf_header)
319 > + elif elf_header.ei_class == ELFCLASS64:
320 > + suffix = "64"
321 > + elif elf_header.ei_class == ELFCLASS32:
322 > + if elf_header.e_machine == EM_X86_64:
323 > + suffix = "x32"
324 > + else:
325 > + suffix = "32"
326 > +
327 > + if prefix is None or suffix is None:
328 > + category = None
329 > + else:
330 > + category = "%s_%s" % (prefix, suffix)
331 > +
332 > + return category
333 > diff --git a/pym/portage/package/ebuild/doebuild.py b/pym/portage/package/ebuild/doebuild.py
334 > index 791b5c3..8bc2009 100644
335 > --- a/pym/portage/package/ebuild/doebuild.py
336 > +++ b/pym/portage/package/ebuild/doebuild.py
337 > @@ -33,7 +33,11 @@ portage.proxy.lazyimport.lazyimport(globals(),
338 > 'portage.package.ebuild._ipc.QueryCommand:QueryCommand',
339 > 'portage.dep._slot_operator:evaluate_slot_operator_equal_deps',
340 > 'portage.package.ebuild._spawn_nofetch:spawn_nofetch',
341 > + 'portage.util.elf.header:ELFHeader',
342 > + 'portage.dep.soname.multilib_category:compute_multilib_category',
343 > 'portage.util._desktop_entry:validate_desktop_entry',
344 > + 'portage.util._dyn_libs.NeededEntry:NeededEntry',
345 > + 'portage.util._dyn_libs.soname_deps:SonameDepsProcessor',
346 > 'portage.util._async.SchedulerInterface:SchedulerInterface',
347 > 'portage.util._eventloop.EventLoop:EventLoop',
348 > 'portage.util._eventloop.global_event_loop:global_event_loop',
349 > @@ -57,9 +61,9 @@ from portage.eapi import eapi_exports_KV, eapi_exports_merge_type, \
350 > eapi_has_pkg_pretend, _get_eapi_attrs
351 > from portage.elog import elog_process, _preload_elog_modules
352 > from portage.elog.messages import eerror, eqawarn
353 > -from portage.exception import DigestException, FileNotFound, \
354 > - IncorrectParameter, InvalidDependString, PermissionDenied, \
355 > - UnsupportedAPIException
356 > +from portage.exception import (DigestException, FileNotFound,
357 > + IncorrectParameter, InvalidData, InvalidDependString,
358 > + PermissionDenied, UnsupportedAPIException)
359 > from portage.localization import _
360 > from portage.output import colormap
361 > from portage.package.ebuild.prepare_build_dirs import prepare_build_dirs
362 > @@ -76,6 +80,11 @@ from _emerge.EbuildSpawnProcess import EbuildSpawnProcess
363 > from _emerge.Package import Package
364 > from _emerge.RootConfig import RootConfig
365 >
366 > +if sys.hexversion >= 0x3000000:
367 > + _unicode = str
368 > +else:
369 > + _unicode = unicode
370 > +
371 > _unsandboxed_phases = frozenset([
372 > "clean", "cleanrm", "config",
373 > "help", "info", "postinst",
374 > @@ -2250,21 +2259,64 @@ def _post_src_install_soname_symlinks(mysettings, out):
375 > is_libdir_cache[obj_parent] = rval
376 > return rval
377 >
378 > + build_info_dir = os.path.join(
379 > + mysettings['PORTAGE_BUILDDIR'], 'build-info')
380 > + try:
381 > + with io.open(_unicode_encode(os.path.join(build_info_dir,
382 > + "PROVIDES_EXCLUDE"), encoding=_encodings['fs'],
383 > + errors='strict'), mode='r', encoding=_encodings['repo.content'],
384 > + errors='replace') as f:
385 > + provides_exclude = f.read()
386 > + except IOError as e:
387 > + if e.errno not in (errno.ENOENT, errno.ESTALE):
388 > + raise
389 > + provides_exclude = ""
390 > +
391 > + try:
392 > + with io.open(_unicode_encode(os.path.join(build_info_dir,
393 > + "REQUIRES_EXCLUDE"), encoding=_encodings['fs'],
394 > + errors='strict'), mode='r', encoding=_encodings['repo.content'],
395 > + errors='replace') as f:
396 > + requires_exclude = f.read()
397 > + except IOError as e:
398 > + if e.errno not in (errno.ENOENT, errno.ESTALE):
399 > + raise
400 > + requires_exclude = ""
401 > +
402 > missing_symlinks = []
403 > + soname_deps = SonameDepsProcessor(
404 > + provides_exclude, requires_exclude)
405 > +
406 > + # Parse NEEDED.ELF.2 like LinkageMapELF.rebuild() does, and
407 > + # rewrite it to include multilib categories.
408 > + needed_file = portage.util.atomic_ofstream(needed_filename,
409 > + encoding=_encodings["repo.content"], errors="strict")
410 >
411 > - # Parse NEEDED.ELF.2 like LinkageMapELF.rebuild() does.
412 > for l in lines:
413 > l = l.rstrip("\n")
414 > if not l:
415 > continue
416 > - fields = l.split(";")
417 > - if len(fields) < 5:
418 > - portage.util.writemsg_level(_("\nWrong number of fields " \
419 > - "in %s: %s\n\n") % (needed_filename, l),
420 > + try:
421 > + entry = NeededEntry.parse(needed_filename, l)
422 > + except InvalidData as e:
423 > + portage.util.writemsg_level("\n%s\n\n" % (e,),
424 > level=logging.ERROR, noiselevel=-1)
425 > continue
426 >
427 > - obj, soname = fields[1:3]
428 > + filename = os.path.join(image_dir,
429 > + entry.filename.lstrip(os.sep))
430 > + with open(_unicode_encode(filename, encoding=_encodings['fs'],
431 > + errors='strict'), 'rb') as f:
432 > + elf_header = ELFHeader.read(f)
433 > +
434 > + # Compute the multilib category and write it back to the file.
435 > + entry.multilib_category = compute_multilib_category(elf_header)
436 > + needed_file.write(_unicode(entry))
437 > +
438 > + soname_deps.add(entry)
439 > + obj = entry.filename
440 > + soname = entry.soname
441 > +
442 > if not soname:
443 > continue
444 > if not is_libdir(os.path.dirname(obj)):
445 > @@ -2284,6 +2336,22 @@ def _post_src_install_soname_symlinks(mysettings, out):
446 >
447 > missing_symlinks.append((obj, soname))
448 >
449 > + needed_file.close()
450
451 Looks like you really want 'with ... as needed_file:' here :).
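
A minimal sketch of the 'with' form, assuming the writer supports (or is wrapped to support) the context-manager protocol. The atomic_writer helper below is a hypothetical stand-in for portage.util.atomic_ofstream, not its real implementation: it writes to a temporary file and renames it into place only if the block finishes without an exception.

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def atomic_writer(path, encoding="utf-8"):
    """Hypothetical stand-in for portage.util.atomic_ofstream."""
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, "w", encoding=encoding) as f:
            yield f
        # Reached only if the body raised nothing: publish the new file.
        os.replace(tmp_path, path)
    except BaseException:
        # Discard the partial temporary file; the original stays intact.
        with contextlib.suppress(OSError):
            os.unlink(tmp_path)
        raise

def rewrite_needed(path, entries):
    """Rewrite a NEEDED.ELF.2 style file from already formatted lines."""
    with atomic_writer(path) as needed_file:
        for entry in entries:
            needed_file.write(entry)
```

With this shape there is no separate close() call to forget, and an exception inside the loop cannot leave a half-written file behind.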
452
453 > +
454 > + if soname_deps.requires is not None:
455 > + with io.open(_unicode_encode(os.path.join(build_info_dir,
456 > + 'REQUIRES'), encoding=_encodings['fs'], errors='strict'),
457 > + mode='w', encoding=_encodings['repo.content'],
458 > + errors='strict') as f:
459 > + f.write(soname_deps.requires)
460 > +
461 > + if soname_deps.provides is not None:
462 > + with io.open(_unicode_encode(os.path.join(build_info_dir,
463 > + 'PROVIDES'), encoding=_encodings['fs'], errors='strict'),
464 > + mode='w', encoding=_encodings['repo.content'],
465 > + errors='strict') as f:
466 > + f.write(soname_deps.provides)
467 > +
468 > if not missing_symlinks:
469 > return
470 >
471 > diff --git a/pym/portage/util/_dyn_libs/LinkageMapELF.py b/pym/portage/util/_dyn_libs/LinkageMapELF.py
472 > index 3920f94..c44666a 100644
473 > --- a/pym/portage/util/_dyn_libs/LinkageMapELF.py
474 > +++ b/pym/portage/util/_dyn_libs/LinkageMapELF.py
475 > @@ -11,12 +11,37 @@ from portage import _os_merge
476 > from portage import _unicode_decode
477 > from portage import _unicode_encode
478 > from portage.cache.mappings import slot_dict_class
479 > -from portage.exception import CommandNotFound
480 > +from portage.exception import CommandNotFound, InvalidData
481 > from portage.localization import _
482 > from portage.util import getlibpaths
483 > from portage.util import grabfile
484 > from portage.util import normalize_path
485 > +from portage.util import varexpand
486 > from portage.util import writemsg_level
487 > +from portage.util._dyn_libs.NeededEntry import NeededEntry
488 > +
489 > +# Map ELF e_machine values from NEEDED.ELF.2 to approximate multilib
490 > +# categories. This approximation will produce incorrect results on x32
491 > +# and mips systems, but the result is not worse than using the raw
492 > +# e_machine value which was used by earlier versions of portage.
493 > +_approx_multilib_categories = {
494 > + "386": "x86_32",
495 > + "68K": "m68k_32",
496 > + "AARCH64": "arm_64",
497 > + "ALPHA": "alpha_64",
498 > + "ARM": "arm_32",
499 > + "IA_64": "ia_64",
500 > + "MIPS": "mips_o32",
501 > + "PARISC": "hppa_64",
502 > + "PPC": "ppc_32",
503 > + "PPC64": "ppc_64",
504 > + "S390": "s390_64",
505 > + "SH": "sh_32",
506 > + "SPARC": "sparc_32",
507 > + "SPARC32PLUS": "sparc_32",
508 > + "SPARCV9": "sparc_64",
509 > + "X86_64": "x86_64",
510 > +}
511 >
512 > class LinkageMapELF(object):
513 >
514 > @@ -294,21 +319,31 @@ class LinkageMapELF(object):
515 > "in %s: %s\n\n") % (location, l),
516 > level=logging.ERROR, noiselevel=-1)
517 > continue
518 > - fields = l.split(";")
519 > - if len(fields) < 5:
520 > - writemsg_level(_("\nWrong number of fields " \
521 > - "in %s: %s\n\n") % (location, l),
522 > + try:
523 > + entry = NeededEntry.parse(location, l)
524 > + except InvalidData as e:
525 > + writemsg_level("\n%s\n\n" % (e,),
526 > level=logging.ERROR, noiselevel=-1)
527 > continue
528 > - arch = fields[0]
529 > - obj = fields[1]
530 > - soname = fields[2]
531 > - path = frozenset(normalize_path(x) \
532 > - for x in filter(None, fields[3].replace(
533 > - "${ORIGIN}", os.path.dirname(obj)).replace(
534 > - "$ORIGIN", os.path.dirname(obj)).split(":")))
535 > +
536 > + # If NEEDED.ELF.2 contains the new multilib category field,
537 > + # then use that for categorization. Otherwise, if a mapping
538 > + # exists, map e_machine (entry.arch) to an approximate
539 > + # multilib category. If all else fails, use e_machine, just
540 > + # as older versions of portage did.
541 > + arch = entry.multilib_category
542 > + if arch is None:
543 > + arch = _approx_multilib_categories.get(
544 > + entry.arch, entry.arch)
545 > +
546 > + obj = entry.filename
547 > + soname = entry.soname
548 > + expand = {"ORIGIN": os.path.dirname(entry.filename)}
549 > + path = frozenset(normalize_path(varexpand(x, expand))
550 > + for x in entry.runpaths)
551 > path = frozensets.setdefault(path, path)
552 > - needed = frozenset(x for x in fields[4].split(",") if x)
553 > + needed = frozenset(entry.needed)
554 > +
555 > needed = frozensets.setdefault(needed, needed)
556 >
557 > obj_key = self._obj_key(obj)
558 > diff --git a/pym/portage/util/_dyn_libs/NeededEntry.py b/pym/portage/util/_dyn_libs/NeededEntry.py
559 > new file mode 100644
560 > index 0000000..5de59a0
561 > --- /dev/null
562 > +++ b/pym/portage/util/_dyn_libs/NeededEntry.py
563 > @@ -0,0 +1,83 @@
564 > +# Copyright 2015 Gentoo Foundation
565 > +# Distributed under the terms of the GNU General Public License v2
566 > +
567 > +from __future__ import unicode_literals
568 > +
569 > +import sys
570 > +
571 > +from portage import _encodings, _unicode_encode
572 > +from portage.exception import InvalidData
573 > +from portage.localization import _
574 > +
575 > +class NeededEntry(object):
576 > + """
577 > + Represents one entry (line) from a NEEDED.ELF.2 file. The entry
578 > + must have 5 or more semicolon-delimited fields in order to be
579 > + considered valid. The sixth field is optional, corresponding
580 > + to the multilib category. The multilib_category attribute is
581 > + None if the corresponding field is either empty or missing.
582 > + """
583 > +
584 > + __slots__ = ("arch", "filename", "multilib_category", "needed",
585 > + "runpaths", "soname")
586 > +
587 > + _MIN_FIELDS = 5
588 > + _MULTILIB_CAT_INDEX = 5
589 > +
590 > + @classmethod
591 > + def parse(cls, filename, line):
592 > + """
593 > + Parse a NEEDED.ELF.2 entry. Raises InvalidData if necessary.
594 > +
595 > + @param filename: file name for use in exception messages
596 > + @type filename: str
597 > + @param line: a single line of text from a NEEDED.ELF.2 file,
598 > + without a trailing newline
599 > + @type line: str
600 > + @rtype: NeededEntry
601 > + @return: A new NeededEntry instance containing data from line
602 > + """
603 > + fields = line.split(";")
604 > + if len(fields) < cls._MIN_FIELDS:
605 > + raise InvalidData(_("Wrong number of fields "
606 > + "in %s: %s\n\n") % (filename, line))
607 > +
608 > + obj = cls()
609 > + # Extra fields may exist (for future extensions).
610 > + if (len(fields) > cls._MULTILIB_CAT_INDEX and
611 > + fields[cls._MULTILIB_CAT_INDEX]):
612 > + obj.multilib_category = fields[cls._MULTILIB_CAT_INDEX]
613 > + else:
614 > + obj.multilib_category = None
615 > +
616 > + del fields[cls._MIN_FIELDS:]
617 > + obj.arch, obj.filename, obj.soname, rpaths, needed = fields
618 > + obj.runpaths = tuple(filter(None, rpaths.split(":")))
619 > + obj.needed = tuple(filter(None, needed.split(",")))
620 > +
621 > + return obj
622 > +
623 > + def __str__(self):
624 > + """
625 > + Format this entry for writing to a NEEDED.ELF.2 file.
626 > + """
627 > + return (
628 > + self.arch + ";" +
629 > + self.filename + ";" +
630 > + self.soname + ";" +
631 > + ":".join(self.runpaths) + ";" +
632 > + ",".join(self.needed) +
633 > + (";" + self.multilib_category if self.multilib_category
634 > + is not None else "") +
635 > + "\n"
636
637 How about using ';'.join? It would definitely make the intention clearer.
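
Roughly what I have in mind, as a sketch only. The class name and the constructor are hypothetical (the patch builds instances through parse()), and the Python 2 __unicode__ handling is left out.

```python
class NeededEntrySketch(object):
    """Cut-down illustration; only the formatting logic is shown."""

    __slots__ = ("arch", "filename", "multilib_category", "needed",
        "runpaths", "soname")

    def __init__(self, arch, filename, soname, runpaths, needed,
        multilib_category=None):
        self.arch = arch
        self.filename = filename
        self.soname = soname
        self.runpaths = runpaths
        self.needed = needed
        self.multilib_category = multilib_category

    def __str__(self):
        fields = [
            self.arch,
            self.filename,
            self.soname,
            ":".join(self.runpaths),
            ",".join(self.needed),
        ]
        # The sixth field is optional, so append it only when it is known.
        if self.multilib_category is not None:
            fields.append(self.multilib_category)
        return ";".join(fields) + "\n"
```

The field list reads in the same order as the file format, and the optional field becomes a plain append instead of a conditional expression in the middle of a concatenation.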
638
639 > + )
640 > +
641 > + if sys.hexversion < 0x3000000:
642 > +
643 > + __unicode__ = __str__
644 > +
645 > + def __str__(self):
646 > + return _unicode_encode(self.__unicode__(),
647 > + encoding=_encodings['content'])
648 > +
649 > + __str__.__doc__ = __unicode__.__doc__
650 > diff --git a/pym/portage/util/_dyn_libs/soname_deps.py b/pym/portage/util/_dyn_libs/soname_deps.py
651 > new file mode 100644
652 > index 0000000..b01c3d2
653 > --- /dev/null
654 > +++ b/pym/portage/util/_dyn_libs/soname_deps.py
655 > @@ -0,0 +1,136 @@
656 > +# Copyright 2015 Gentoo Foundation
657 > +# Distributed under the terms of the GNU General Public License v2
658 > +
659 > +import fnmatch
660 > +from itertools import chain
661 > +import os
662 > +import re
663 > +
664 > +from portage.util import shlex_split
665 > +
666 > +class SonameDepsProcessor(object):
667 > + """
668 > + Processes NEEDED.ELF.2 entries for one package, in order to generate
669 > + REQUIRES and PROVIDES data.
670 > +
671 > + Any sonames provided by the package will automatically be filtered
672 > + from the generated REQUIRES values.
673 > + """
674 > +
675 > + def __init__(self, provides_exclude, requires_exclude):
676 > + """
677 > + @param provides_exclude: PROVIDES_EXCLUDE value
678 > + @type provides_exclude: str
679 > + @param requires_exclude: REQUIRES_EXCLUDE value
680 > + @type requires_exclude: str
681 > + """
682 > + self._provides_exclude = self._exclude_pattern(provides_exclude)
683 > + self._requires_exclude = self._exclude_pattern(requires_exclude)
684 > + self._requires_map = {}
685 > + self._provides_map = {}
686 > + self._provides_unfiltered = {}
687 > + self._provides = None
688 > + self._requires = None
689 > + self._intersected = False
690 > +
691 > + @staticmethod
692 > + def _exclude_pattern(s):
693 > + # shlex_split enables quoted whitespace inside patterns
694 > + if s:
695 > + pat = re.compile("|".join(
696 > + fnmatch.translate(x.lstrip(os.sep))
697 > + for x in shlex_split(s)))
698 > + else:
699 > + pat = None
700 > + return pat
701 > +
702 > + def add(self, entry):
703 > + """
704 > + Add one NEEDED.ELF.2 entry, for inclusion in the generated
705 > + REQUIRES and PROVIDES values.
706 > +
707 > + @param entry: NEEDED.ELF.2 entry
708 > + @type entry: NeededEntry
709 > + """
710 > +
711 > + multilib_cat = entry.multilib_category
712 > + if multilib_cat is None:
713 > + # This usage is invalid. The caller must ensure that
714 > + # the multilib category data is supplied here.
715 > + raise AssertionError(
716 > + "Missing multilib category data: %s" % entry.filename)
717 > +
718 > + if entry.needed and (
719 > + self._requires_exclude is None or
720 > + self._requires_exclude.match(
721 > + entry.filename.lstrip(os.sep)) is None):
722 > + for x in entry.needed:
723 > + if (self._requires_exclude is None or
724 > + self._requires_exclude.match(x) is None):
725 > + self._requires_map.setdefault(
726 > + multilib_cat, set()).add(x)
727 > +
728 > + if entry.soname:
729 > + self._provides_unfiltered.setdefault(
730 > + multilib_cat, set()).add(entry.soname)
731 > +
732 > + if entry.soname and (
733 > + self._provides_exclude is None or
734 > + (self._provides_exclude.match(
735 > + entry.filename.lstrip(os.sep)) is None and
736 > + self._provides_exclude.match(entry.soname) is None)):
737 > + self._provides_map.setdefault(
738 > + multilib_cat, set()).add(entry.soname)
739 > +
740 > + def _intersect(self):
741 > + requires_map = self._requires_map
742 > + provides_map = self._provides_map
743 > + provides_unfiltered = self._provides_unfiltered
744 > +
745 > + for multilib_cat in set(chain(requires_map, provides_map)):
746 > + requires_map.setdefault(multilib_cat, set())
747 > + provides_map.setdefault(multilib_cat, set())
748 > + provides_unfiltered.setdefault(multilib_cat, set())
749 > + for soname in list(requires_map[multilib_cat]):
750 > + if soname in provides_unfiltered[multilib_cat]:
751 > + requires_map[multilib_cat].remove(soname)
752 > +
753 > + provides_data = []
754 > + for multilib_cat in sorted(provides_map):
755 > + if provides_map[multilib_cat]:
756 > + provides_data.append(multilib_cat + ":")
757 > + provides_data.extend(sorted(provides_map[multilib_cat]))
758 > +
759 > + if provides_data:
760 > + self._provides = " ".join(provides_data) + "\n"
761 > +
762 > + requires_data = []
763 > + for multilib_cat in sorted(requires_map):
764 > + if requires_map[multilib_cat]:
765 > + requires_data.append(multilib_cat + ":")
766 > + requires_data.extend(sorted(requires_map[multilib_cat]))
767 > +
768 > + if requires_data:
769 > + self._requires = " ".join(requires_data) + "\n"
770 > +
771 > + self._intersected = True
772 > +
773 > + @property
774 > + def provides(self):
775 > + """
776 > + @rtype: str
777 > + @return: PROVIDES value generated from NEEDED.ELF.2 entries
778 > + """
779 > + if not self._intersected:
780 > + self._intersect()
781 > + return self._provides
782 > +
783 > + @property
784 > + def requires(self):
785 > + """
786 > + @rtype: str
787 > + @return: REQUIRES value generated from NEEDED.ELF.2 entries
788 > + """
789 > + if not self._intersected:
790 > + self._intersect()
791 > + return self._requires
792 > diff --git a/pym/portage/util/elf/__init__.py b/pym/portage/util/elf/__init__.py
793 > new file mode 100644
794 > index 0000000..4725d33
795 > --- /dev/null
796 > +++ b/pym/portage/util/elf/__init__.py
797 > @@ -0,0 +1,2 @@
798 > +# Copyright 2015 Gentoo Foundation
799 > +# Distributed under the terms of the GNU General Public License v2
800 > diff --git a/pym/portage/util/elf/constants.py b/pym/portage/util/elf/constants.py
801 > new file mode 100644
802 > index 0000000..3857b71
803 > --- /dev/null
804 > +++ b/pym/portage/util/elf/constants.py
805 > @@ -0,0 +1,36 @@
806 > +# Copyright 2015 Gentoo Foundation
807 > +# Distributed under the terms of the GNU General Public License v2
808
809 I think you could mention here where those constants come from, so that
810 others can easily find them and update them for new arches. It would
811 also probably be good to note what else needs to be updated :).
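
To illustrate the kind of note I mean, here is a rough sketch of a module docstring for constants.py. The values below are the ones from the patch, and they match the standard ELF definitions found in <elf.h>. The wording of the note, and the pointer to the multilib category code as the other place to update, are only my assumptions about what would be useful, not a statement of where the patch actually took them from.

```python
"""
ELF header constants (sketch of a provenance note, wording is hypothetical).

The EI_*, ELFCLASS*, ELFDATA* and EM_* values follow the ELF
specification and match the definitions in <elf.h>.

Assumption: when adding an EM_* value for a new arch, the code that maps
e_machine to a multilib category probably needs a matching entry too.
"""

# Offsets into e_ident and the values found there.
EI_CLASS = 4
ELFCLASS32 = 1
ELFCLASS64 = 2

EI_DATA = 5
ELFDATA2LSB = 1
ELFDATA2MSB = 2

# Offset of e_machine and a few of the machine types from the patch.
E_MACHINE = 18
EM_386 = 3
EM_X86_64 = 62
EM_AARCH64 = 183
```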
812
813 > +
814 > +EI_CLASS = 4
815 > +ELFCLASS32 = 1
816 > +ELFCLASS64 = 2
817 > +
818 > +EI_DATA = 5
819 > +ELFDATA2LSB = 1
820 > +ELFDATA2MSB = 2
821 > +
822 > +E_MACHINE = 18
823 > +EM_SPARC = 2
824 > +EM_386 = 3
825 > +EM_68K = 4
826 > +EM_MIPS = 8
827 > +EM_PARISC = 15
828 > +EM_SPARC32PLUS = 18
829 > +EM_PPC = 20
830 > +EM_PPC64 = 21
831 > +EM_S390 = 22
832 > +EM_ARM = 40
833 > +EM_ALPHA = 41
834 > +EM_SH = 42
835 > +EM_SPARCV9 = 43
836 > +EM_IA_64 = 50
837 > +EM_X86_64 = 62
838 > +EM_AARCH64 = 183
839 > +
840 > +E_ENTRY = 24
841 > +EF_MIPS_ABI = 0x0000F000
842 > +EF_MIPS_ABI2 = 0x00000020
843 > +E_MIPS_ABI_O32 = 0x00001000
844 > +E_MIPS_ABI_O64 = 0x00002000
845 > +E_MIPS_ABI_EABI32 = 0x00003000
846 > +E_MIPS_ABI_EABI64 = 0x00004000
847 > diff --git a/pym/portage/util/elf/header.py b/pym/portage/util/elf/header.py
848 > new file mode 100644
849 > index 0000000..3310eeb
850 > --- /dev/null
851 > +++ b/pym/portage/util/elf/header.py
852 > @@ -0,0 +1,62 @@
853 > +# Copyright 2015 Gentoo Foundation
854 > +# Distributed under the terms of the GNU General Public License v2
855 > +
856 > +import collections
857 > +
858 > +from ..endian.decode import (decode_uint16_le, decode_uint32_le,
859 > + decode_uint16_be, decode_uint32_be)
860 > +from .constants import (E_ENTRY, E_MACHINE, EI_CLASS, ELFCLASS32,
861 > + ELFCLASS64, ELFDATA2LSB, ELFDATA2MSB)
862 > +
863 > +class ELFHeader(object):
864 > +
865 > + __slots__ = ('e_flags', 'e_machine', 'ei_class', 'ei_data')
866 > +
867 > + @classmethod
868 > + def read(cls, f):
869 > + """
870 > + @param f: an open ELF file
871 > + @type f: file
872 > + @rtype: ELFHeader
873 > + @return: A new ELFHeader instance containing data from f
874 > + """
875 > + f.seek(EI_CLASS)
876 > + ei_class = ord(f.read(1))
877 > + ei_data = ord(f.read(1))
878 > +
879 > + if ei_class == ELFCLASS32:
880 > + width = 32
881 > + elif ei_class == ELFCLASS64:
882 > + width = 64
883 > + else:
884 > + width = None
885 > +
886 > + if ei_data == ELFDATA2LSB:
887 > + uint16 = decode_uint16_le
888 > + uint32 = decode_uint32_le
889 > + elif ei_data == ELFDATA2MSB:
890 > + uint16 = decode_uint16_be
891 > + uint32 = decode_uint32_be
892 > + else:
893 > + uint16 = None
894 > + uint32 = None
895 > +
896 > + if width is None or uint16 is None:
897 > + e_machine = None
898 > + e_flags = None
899 > + else:
900 > + f.seek(E_MACHINE)
901 > + e_machine = uint16(f.read(2))
902 > +
903 > + # E_ENTRY + 3 * sizeof(uintN)
904 > + e_flags_offset = E_ENTRY + 3 * width // 8
905 > + f.seek(e_flags_offset)
906 > + e_flags = uint32(f.read(4))
907 > +
908 > + obj = cls()
909 > + obj.e_flags = e_flags
910 > + obj.e_machine = e_machine
911 > + obj.ei_class = ei_class
912 > + obj.ei_data = ei_data
913 > +
914 > + return obj
915 > diff --git a/pym/portage/util/endian/__init__.py b/pym/portage/util/endian/__init__.py
916 > new file mode 100644
917 > index 0000000..4725d33
918 > --- /dev/null
919 > +++ b/pym/portage/util/endian/__init__.py
920 > @@ -0,0 +1,2 @@
921 > +# Copyright 2015 Gentoo Foundation
922 > +# Distributed under the terms of the GNU General Public License v2
923 > diff --git a/pym/portage/util/endian/decode.py b/pym/portage/util/endian/decode.py
924 > new file mode 100644
925 > index 0000000..ec0dcec
926 > --- /dev/null
927 > +++ b/pym/portage/util/endian/decode.py
928 > @@ -0,0 +1,56 @@
929 > +# Copyright 2015 Gentoo Foundation
930 > +# Distributed under the terms of the GNU General Public License v2
931 > +
932 > +def decode_uint16_be(data):
933 > + """
934 > + Decode an unsigned 16-bit integer with big-endian encoding.
935 > +
936 > + @param data: string of bytes of length 2
937 > + @type data: bytes
938 > + @rtype: int
939 > + @return: unsigned integer value of the decoded data
940 > + """
941 > + return (ord(data[0:1]) << 8) + ord(data[1:2])
942 > +
943 > +def decode_uint16_le(data):
944 > + """
945 > + Decode an unsigned 16-bit integer with little-endian encoding.
946 > +
947 > + @param data: string of bytes of length 2
948 > + @type data: bytes
949 > + @rtype: int
950 > + @return: unsigned integer value of the decoded data
951 > + """
952 > + return ord(data[0:1]) + (ord(data[1:2]) << 8)
953 > +
954 > +def decode_uint32_be(data):
955 > + """
956 > + Decode an unsigned 32-bit integer with big-endian encoding.
957 > +
958 > + @param data: string of bytes of length 4
959 > + @type data: bytes
960 > + @rtype: int
961 > + @return: unsigned integer value of the decoded data
962 > + """
963 > + return (
964 > + (ord(data[0:1]) << 24) +
965 > + (ord(data[1:2]) << 16) +
966 > + (ord(data[2:3]) << 8) +
967 > + ord(data[3:4])
968 > + )
969 > +
970 > +def decode_uint32_le(data):
971 > + """
972 > + Decode an unsigned 32-bit integer with little-endian encoding.
973 > +
974 > + @param data: string of bytes of length 4
975 > + @type data: bytes
976 > + @rtype: int
977 > + @return: unsigned integer value of the decoded data
978 > + """
979 > + return (
980 > + ord(data[0:1]) +
981 > + (ord(data[1:2]) << 8) +
982 > + (ord(data[2:3]) << 16) +
983 > + (ord(data[3:4]) << 24)
984 > + )
985
986 How about using the struct module instead of reinventing the wheel?
987
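For example, a minimal sketch of the same four helpers on top of struct could look like the following. The function names are kept from the patch; the format strings are standard struct codes ("<" little-endian, ">" big-endian, "H" unsigned 16-bit, "I" unsigned 32-bit). I have not tested this against the rest of the patch, so treat it as an illustration only.

```python
import struct

def decode_uint16_be(data):
    """Decode an unsigned 16-bit integer with big-endian encoding."""
    return struct.unpack(">H", data)[0]

def decode_uint16_le(data):
    """Decode an unsigned 16-bit integer with little-endian encoding."""
    return struct.unpack("<H", data)[0]

def decode_uint32_be(data):
    """Decode an unsigned 32-bit integer with big-endian encoding."""
    return struct.unpack(">I", data)[0]

def decode_uint32_le(data):
    """Decode an unsigned 32-bit integer with little-endian encoding."""
    return struct.unpack("<I", data)[0]

# Example: an e_machine field of EM_X86_64 (62) in a little-endian file.
e_machine = decode_uint16_le(b"\x3e\x00")
```

One difference in behavior: struct.unpack raises struct.error when the input is not exactly the expected length, so a truncated file would need that exception handled by the caller.
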
988 --
989 Best regards,
990 Michał Górny
