On 01/26/15 22:16, Zac Medico wrote:
2 |
> Generate soname dependency metadata for binary and installed packages, |
3 |
> in the form of PROVIDES and REQUIRES metadata. It is useful to generate |
4 |
> PROVIDES and REQUIRES metadata now, so that it will be available |
5 |
> when dependency resolver support is added in the future. Note that |
6 |
> slot-operator dependencies will not be able to serve as a substitute |
7 |
> for soname dependencies for the forseeable future, because system |
8 |
> dependencies are frequently unspecified (according to Gentoo policy). |
9 |
|
Thanks for putting this into the commit message. Getting the point
across to people was at times painful.
12 |
|
13 |
> |
14 |
> The PROVIDES/REQUIRES system is very similar to the automatic Requires |
15 |
> and Provides system which is supported by RPM. The PROVIDES/REQUIRES |
16 |
> metadata is generated automatically from the ELF files that are |
17 |
> installed by a package. The PROVIDES/REQUIRES syntax is described in |
18 |
> the /var/db/pkg section of the portage(5) man page. REQUIRES_EXCLUDE |
19 |
> and PROVIDES_EXCLUDE ebuild variables allow for filtering of the |
20 |
> sonames that are saved in REQUIRES and PROVIDES (see the ebuild(5) man |
21 |
> page for details). |
22 |
|
ELF gets us like 99% of the way, but we will eventually have to think
about Mach-O for gentoo-on-mac etc. Maybe COFF and a.out too. Not sure
where to draw the line.
26 |
|
27 |
> |
28 |
> The /var/db/pkg NEEDED.ELF.2 format now includes an additional field |
29 |
> which indicates the multilib category, as discussed in bug #534206. The |
30 |
> multilib category is used to categorize the sonames that are listed in |
31 |
> PROVIDES/REQUIRES metadata, since sonames need to be resolved |
32 |
> separately for each multilib category. The complete list of supported |
33 |
> multilib categories is documented in the comments of the |
34 |
> portage.dep.soname.multilib_category module. |
35 |
> |
36 |
> X-Gentoo-Bug: 282639 |
37 |
> X-Gentoo-Bug-URL: https://bugs.gentoo.org/show_bug.cgi?id=282639 |
38 |
> --- |
39 |
> bin/ebuild.sh | 2 +- |
40 |
> bin/phase-functions.sh | 2 +- |
41 |
> man/ebuild.5 | 12 +++ |
42 |
> man/portage.5 | 25 +++++ |
43 |
> pym/_emerge/Package.py | 3 +- |
44 |
> pym/portage/dbapi/bintree.py | 5 +- |
45 |
> pym/portage/dbapi/vartree.py | 1 + |
46 |
> pym/portage/dep/soname/__init__.py | 2 + |
47 |
> pym/portage/dep/soname/multilib_category.py | 112 +++++++++++++++++++++++ |
48 |
> pym/portage/package/ebuild/doebuild.py | 86 ++++++++++++++++-- |
49 |
> pym/portage/util/_dyn_libs/LinkageMapELF.py | 61 ++++++++++--- |
50 |
> pym/portage/util/_dyn_libs/NeededEntry.py | 83 +++++++++++++++++ |
51 |
> pym/portage/util/_dyn_libs/soname_deps.py | 136 ++++++++++++++++++++++++++++ |
52 |
> pym/portage/util/elf/__init__.py | 2 + |
53 |
> pym/portage/util/elf/constants.py | 36 ++++++++ |
54 |
> pym/portage/util/elf/header.py | 62 +++++++++++++ |
55 |
> pym/portage/util/endian/__init__.py | 2 + |
56 |
> pym/portage/util/endian/decode.py | 56 ++++++++++++ |
57 |
> 18 files changed, 661 insertions(+), 27 deletions(-) |
58 |
> create mode 100644 pym/portage/dep/soname/__init__.py |
59 |
> create mode 100644 pym/portage/dep/soname/multilib_category.py |
60 |
> create mode 100644 pym/portage/util/_dyn_libs/NeededEntry.py |
61 |
> create mode 100644 pym/portage/util/_dyn_libs/soname_deps.py |
62 |
> create mode 100644 pym/portage/util/elf/__init__.py |
63 |
> create mode 100644 pym/portage/util/elf/constants.py |
64 |
> create mode 100644 pym/portage/util/elf/header.py |
65 |
> create mode 100644 pym/portage/util/endian/__init__.py |
66 |
> create mode 100644 pym/portage/util/endian/decode.py |
67 |
> |
68 |
|
I can't speak to the integration into portage. I'm sure you got that
right. I'll comment on the ELF-ish stuff.
71 |
|
72 |
> diff --git a/man/portage.5 b/man/portage.5 |
73 |
> index 189561c..bf159fd 100644 |
74 |
> --- a/man/portage.5 |
75 |
> +++ b/man/portage.5 |
76 |
> @@ -1443,6 +1443,31 @@ can be changed quickly. Generally though there is one file per environment |
77 |
> variable that "matters" (like CFLAGS) with the contents stored inside of it. |
78 |
> Another common file is the CONTENTS file which lists the path and hashes of |
79 |
> all objects that the package installed onto your system. |
80 |
> +.TP |
81 |
> +.BR PROVIDES |
82 |
> +Contains information about the sonames that a package provides, which is |
83 |
> +automatically generated from the files that it installs. The sonames |
84 |
> +may have been filtered by the \fBPROVIDES_EXCLUDE\fR \fBebuild\fR(5) |
85 |
> +variable. A multilib category, followed by a colon, always preceeds a |
86 |
> +list of one or more sonames. |
87 |
> + |
88 |
> +.I Example: |
89 |
> +.nf |
90 |
> +x86_32: libcom_err.so.2 libss.so.2 x86_64: libcom_err.so.2 libss.so.2 |
91 |
> +.fi |
92 |
> +.TP |
93 |
> +.BR REQUIRES |
94 |
> +Contains information about the sonames that a package requires, which is |
95 |
> +automatically generated from the files that it installs. The sonames |
96 |
> +may have been filtered by the \fBREQUIRES_EXCLUDE\fR \fBebuild\fR(5) |
97 |
> +variable. Any sonames that a package provides are automatically excluded |
98 |
> +from \fBREQUIRES\fR. A multilib category, followed by a colon, always |
99 |
> +preceeds a list of one or more sonames. |
100 |
> + |
101 |
> +.I Example: |
102 |
> +.nf |
103 |
> +x86_32: ld-linux.so.2 libc.so.6 x86_64: ld-linux-x86-64.so.2 libc.so.6 |
104 |
> +.fi |
105 |
> .RE |
106 |
> .TP |
107 |
> .BR /var/lib/portage/ |
108 |
|
I'm a bit confused here. So PROVIDES and REQUIRES are new files in
/var/db/pkg/<cat>/<pkg>? And the format of NEEDED.ELF.2 is not
changing? Correct?
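
To check my reading of the man page text, here is a minimal sketch of how such a file could be split back into per-category lists. The function name parse_soname_metadata is hypothetical and not part of the patch; the only thing taken from the patch is the rule that a multilib category followed by a colon precedes its sonames.

```python
def parse_soname_metadata(value):
    """Split a PROVIDES/REQUIRES string into {multilib_category: [sonames]}."""
    result = {}
    current = None
    for token in value.split():
        if token.endswith(":"):
            # A token ending in a colon starts a new multilib category.
            current = token[:-1]
            result.setdefault(current, [])
        elif current is None:
            raise ValueError("soname %r appears before any category" % token)
        else:
            result[current].append(token)
    return result


# The REQUIRES example string from the man page hunk quoted above.
requires = parse_soname_metadata(
    "x86_32: ld-linux.so.2 libc.so.6 x86_64: ld-linux-x86-64.so.2 libc.so.6")
```

The point is only that the format is unambiguous to tokenize, since a soname cannot end in a colon.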
112 |
|
113 |
> diff --git a/pym/portage/dep/soname/__init__.py b/pym/portage/dep/soname/__init__.py |
114 |
> new file mode 100644 |
115 |
> index 0000000..4725d33 |
116 |
> --- /dev/null |
117 |
> +++ b/pym/portage/dep/soname/__init__.py |
118 |
> @@ -0,0 +1,2 @@ |
119 |
> +# Copyright 2015 Gentoo Foundation |
120 |
> +# Distributed under the terms of the GNU General Public License v2 |
121 |
> diff --git a/pym/portage/dep/soname/multilib_category.py b/pym/portage/dep/soname/multilib_category.py |
122 |
> new file mode 100644 |
123 |
> index 0000000..8cc8fd3 |
124 |
> --- /dev/null |
125 |
> +++ b/pym/portage/dep/soname/multilib_category.py |
126 |
> @@ -0,0 +1,112 @@ |
127 |
> +# Copyright 2015 Gentoo Foundation |
128 |
> +# Distributed under the terms of the GNU General Public License v2 |
129 |
> +# |
130 |
> +# Compute a multilib category, as discussed here: |
131 |
> +# |
132 |
> +# https://bugs.gentoo.org/show_bug.cgi?id=534206 |
133 |
> +# |
134 |
> +# Supported categories: |
135 |
> +# |
136 |
> +# alpha_{32,64} |
137 |
> +# arm_{32,64} |
138 |
|
We should note at some point that arm_32 means EABI and not OABI, which
are two different 32-bit ARM ABIs. We don't support OABI in Gentoo
anymore, but this can still be confusing. If it did come up in some
context, we could distinguish it as arm_o32, like mips below.

We probably don't have to note that arm_64 is aarch64, which is really a
different ISA, not backwards compatible with 32-bit ARM. As you say below,
this is just a naming convention, but in every other case each line does
refer to one ISA, except maybe that ia_32 can be confused with x86.
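
To make the naming convention concrete, here is a cut-down sketch of the mapping, assuming only the four machine constants shown (values as in the patch's constants.py). The name category is hypothetical, and the mips handling is left out entirely.

```python
ELFCLASS32, ELFCLASS64 = 1, 2
EM_386, EM_ARM, EM_X86_64, EM_AARCH64 = 3, 40, 62, 183

# Two distinct ISAs can share one prefix; the suffix tells them apart.
PREFIXES = {EM_386: "x86", EM_X86_64: "x86", EM_ARM: "arm", EM_AARCH64: "arm"}


def category(ei_class, e_machine):
    """Return a multilib category string, or None if unrecognized."""
    prefix = PREFIXES.get(e_machine)
    if prefix is None:
        return None
    if ei_class == ELFCLASS64:
        suffix = "64"
    elif ei_class == ELFCLASS32:
        # A 32-bit class with the x86-64 machine type is the x32 ABI.
        suffix = "x32" if e_machine == EM_X86_64 else "32"
    else:
        return None
    return "%s_%s" % (prefix, suffix)
```

So arm_64 comes from EM_AARCH64 and arm_32 from EM_ARM, even though the prefix alone suggests a single ISA.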
148 |
|
149 |
> +# hppa_{32,64} |
150 |
> +# ia_{32,64} |
151 |
> +# m68k_{32,64} |
152 |
> +# mips_{eabi32,eabi64,n32,n64,o32,o64} |
153 |
> +# ppc_{32,64} |
154 |
> +# s390_{32,64} |
155 |
> +# sh_{32,64} |
156 |
> +# sparc_{32,64} |
157 |
> +# x86_{32,64,x32} |
158 |
> +# |
159 |
> +# NOTES: |
160 |
> +# |
161 |
> +# * The ABIs referenced by some of the above *_32 and *_64 categories |
162 |
> +# may be imaginary, but they are listed anyway, since the goal is to |
163 |
> +# establish a naming convention that is as consistent and uniform as |
164 |
> +# possible. |
165 |
> +# |
166 |
> +# * The Elf header's e_ident[EI_OSABI] byte is completely ignored, |
167 |
> +# since OS-independence is one of the goals. The assumption is that, |
168 |
> +# for given installation, we are only interested in tracking multilib |
169 |
> +# ABIs for a single OS. |
170 |
|
If you run readelf -h on (say) bash in any of our stage3 tarballs, you
always get "OS/ABI: UNIX - System V" irrespective of arch and ABI. I
don't know what you would get on BSD, but the field is totally
irrelevant for our purposes despite the name. As far as I can tell, it
is totally invariant across arches and ABIs.

You can even unpack the stage3s on an amd64 host and run readelf
from the host on the chroot target and you'll get the ELF header, so you
don't need access to native hardware.

The comment suggests that there might be some interesting information
there, but there isn't. Maybe I'm just reading too much into it.
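
For anyone who wants to check this without readelf, the byte in question is easy to read directly. A minimal sketch, assuming Python 3 (where indexing bytes yields an int); read_osabi is a hypothetical helper, and the header below is synthetic rather than taken from a real binary.

```python
import io

EI_OSABI = 7       # index of the OS/ABI byte within e_ident, per <elf.h>
ELFOSABI_NONE = 0  # the value readelf prints as "UNIX - System V"


def read_osabi(f):
    """Return the e_ident[EI_OSABI] byte of an open ELF file, or None."""
    f.seek(0)
    ident = f.read(16)  # e_ident is EI_NIDENT == 16 bytes long
    if len(ident) < 16 or ident[:4] != b"\x7fELF":
        return None
    return ident[EI_OSABI]


# Synthetic e_ident: magic, ELFCLASS64, ELFDATA2LSB, EV_CURRENT, OSABI 0,
# then 8 bytes of padding.
fake = io.BytesIO(b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8))
osabi = read_osabi(fake)
```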
183 |
|
184 |
> + |
185 |
> +from ...util.elf.constants import ( |
186 |
> + EF_MIPS_ABI, EF_MIPS_ABI2, ELFCLASS32, ELFCLASS64, |
187 |
> + EM_386, EM_68K, EM_AARCH64, EM_ALPHA, EM_ARM, EM_IA_64, EM_MIPS, |
188 |
> + EM_PARISC, EM_PPC, EM_PPC64, EM_S390, EM_SH, EM_SPARC, |
189 |
> + EM_SPARC32PLUS, EM_SPARCV9, EM_X86_64, E_MIPS_ABI_EABI32, |
190 |
> + E_MIPS_ABI_EABI64, E_MIPS_ABI_O32, E_MIPS_ABI_O64) |
191 |
> + |
192 |
> +_machine_prefix_map = { |
193 |
> + EM_386: "x86", |
194 |
> + EM_68K: "m68k", |
195 |
> + EM_AARCH64: "arm", |
196 |
> + EM_ALPHA: "alpha", |
197 |
> + EM_ARM: "arm", |
198 |
> + EM_IA_64: "ia", |
199 |
> + EM_MIPS: "mips", |
200 |
> + EM_PARISC: "hppa", |
201 |
> + EM_PPC: "ppc", |
202 |
> + EM_PPC64: "ppc", |
203 |
> + EM_S390: "s390", |
204 |
> + EM_SH: "sh", |
205 |
> + EM_SPARC: "sparc", |
206 |
> + EM_SPARC32PLUS: "sparc", |
207 |
> + EM_SPARCV9: "sparc", |
208 |
> + EM_X86_64: "x86", |
209 |
> +} |
210 |
> + |
211 |
> +_mips_abi_map = { |
212 |
> + E_MIPS_ABI_EABI32: "eabi32", |
213 |
> + E_MIPS_ABI_EABI64: "eabi64", |
214 |
> + E_MIPS_ABI_O32: "o32", |
215 |
> + E_MIPS_ABI_O64: "o64", |
216 |
> +} |
217 |
> + |
218 |
> +def _compute_suffix_mips(elf_header): |
219 |
> + |
220 |
> + name = None |
221 |
> + mips_abi = elf_header.e_flags & EF_MIPS_ABI |
222 |
> + |
223 |
> + if mips_abi: |
224 |
> + name = _mips_abi_map.get(mips_abi) |
225 |
> + elif elf_header.e_flags & EF_MIPS_ABI2: |
226 |
> + name = "n32" |
227 |
> + elif elf_header.ei_class == ELFCLASS64: |
228 |
> + name = "n64" |
229 |
> + |
230 |
> + return name |
231 |
> + |
232 |
> +def compute_multilib_category(elf_header): |
233 |
> + """ |
234 |
> + Compute a multilib category from an ELF header. |
235 |
> + |
236 |
> + @param elf_header: an ELFHeader instance |
237 |
> + @type elf_header: ELFHeader |
238 |
> + @rtype: str |
239 |
> + @return: A multilib category, or None if elf_header does not fit |
240 |
> + into a recognized category |
241 |
> + """ |
242 |
> + category = None |
243 |
> + if elf_header.e_machine is not None: |
244 |
> + |
245 |
> + prefix = _machine_prefix_map.get(elf_header.e_machine) |
246 |
> + suffix = None |
247 |
> + |
248 |
> + if prefix == "mips": |
249 |
> + suffix = _compute_suffix_mips(elf_header) |
250 |
> + elif elf_header.ei_class == ELFCLASS64: |
251 |
> + suffix = "64" |
252 |
> + elif elf_header.ei_class == ELFCLASS32: |
253 |
> + if elf_header.e_machine == EM_X86_64: |
254 |
> + suffix = "x32" |
255 |
> + else: |
256 |
> + suffix = "32" |
257 |
> + |
258 |
> + if prefix is None or suffix is None: |
259 |
> + category = None |
260 |
> + else: |
261 |
> + category = "%s_%s" % (prefix, suffix) |
262 |
> + |
263 |
> +	return category
264 |
|
Looks good.
266 |
|
267 |
> diff --git a/pym/portage/util/_dyn_libs/NeededEntry.py b/pym/portage/util/_dyn_libs/NeededEntry.py |
268 |
> new file mode 100644 |
269 |
> index 0000000..5de59a0 |
270 |
> --- /dev/null |
271 |
> +++ b/pym/portage/util/_dyn_libs/NeededEntry.py |
272 |
> @@ -0,0 +1,83 @@ |
273 |
> +# Copyright 2015 Gentoo Foundation |
274 |
> +# Distributed under the terms of the GNU General Public License v2 |
275 |
> + |
276 |
> +from __future__ import unicode_literals |
277 |
> + |
278 |
> +import sys |
279 |
> + |
280 |
> +from portage import _encodings, _unicode_encode |
281 |
> +from portage.exception import InvalidData |
282 |
> +from portage.localization import _ |
283 |
> + |
284 |
> +class NeededEntry(object): |
285 |
> + """ |
286 |
> + Represents one entry (line) from a NEEDED.ELF.2 file. The entry |
287 |
> + must have 5 or more semicolon-delimited fields in order to be |
288 |
> + considered valid. The sixth field is optional, corresponding |
289 |
> + to the multilib category. The multilib_category attribute is |
290 |
> + None if the corresponding field is either empty or missing. |
291 |
> + """ |
292 |
> + |
293 |
> + __slots__ = ("arch", "filename", "multilib_category", "needed", |
294 |
> + "runpaths", "soname") |
295 |
|
Looks like this answers my question above about the format of NEEDED.ELF.2.
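
As I read the docstring, old five-field entries stay valid and the sixth field is simply optional. A minimal sketch of that rule follows. The field order and the inner delimiters (colon for runpaths, comma for needed) are my assumption about the existing format, and parse_needed_line and Entry are hypothetical names, not the patch's NeededEntry API.

```python
from collections import namedtuple

Entry = namedtuple(
    "Entry", "arch filename soname runpaths needed multilib_category")


def parse_needed_line(line):
    """Parse one semicolon-delimited entry; the sixth field is optional."""
    fields = line.rstrip("\n").split(";")
    if len(fields) < 5:
        raise ValueError("expected at least 5 fields, got %d" % len(fields))
    arch, filename, soname, runpaths, needed = fields[:5]
    # An empty sixth field and a missing sixth field both map to None.
    category = fields[5] if len(fields) > 5 and fields[5] else None
    return Entry(
        arch, filename, soname or None,
        tuple(p for p in runpaths.split(":") if p),
        tuple(n for n in needed.split(",") if n),
        category)


# Made-up entries: one in the old five-field form, one with the new field.
old = parse_needed_line("X86_64;/usr/bin/foo;;;libc.so.6\n")
new = parse_needed_line(
    "X86_64;/usr/lib64/libfoo.so.1;libfoo.so.1;;libc.so.6,libm.so.6;x86_64\n")
```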
297 |
|
298 |
> diff --git a/pym/portage/util/_dyn_libs/soname_deps.py b/pym/portage/util/_dyn_libs/soname_deps.py
299 |
> new file mode 100644 |
300 |
> index 0000000..b01c3d2 |
301 |
> --- /dev/null |
302 |
> +++ b/pym/portage/util/_dyn_libs/soname_deps.py |
303 |
> @@ -0,0 +1,136 @@ |
304 |
> +# Copyright 2015 Gentoo Foundation |
305 |
> +# Distributed under the terms of the GNU General Public License v2 |
306 |
> + |
307 |
> +import fnmatch |
308 |
> +from itertools import chain |
309 |
> +import os |
310 |
> +import re |
311 |
> + |
312 |
> +from portage.util import shlex_split |
313 |
> + |
314 |
> +class SonameDepsProcessor(object): |
315 |
> + """ |
316 |
> + Processes NEEDED.ELF.2 entries for one package, in order to generate |
317 |
> + REQUIRES and PROVIDES data. |
318 |
> + |
319 |
> + Any sonames provided by the package will automatically be filtered |
320 |
> + from the generated REQUIRES and PROVIDES values. |
321 |
> + """ |
322 |
> + |
323 |
> + def __init__(self, provides_exclude, requires_exclude): |
324 |
> + """ |
325 |
> + @param provides_exclude: PROVIDES_EXCLUDE value |
326 |
> + @type provides_exclude: str |
327 |
> + @param requires_exclude: REQUIRES_EXCLUDE value |
328 |
> + @type requires_exclude: str |
329 |
> + """ |
330 |
> + self._provides_exclude = self._exclude_pattern(provides_exclude) |
331 |
> + self._requires_exclude = self._exclude_pattern(requires_exclude) |
332 |
> + self._requires_map = {} |
333 |
> + self._provides_map = {} |
334 |
> + self._provides_unfiltered = {} |
335 |
> + self._provides = None |
336 |
> + self._requires = None |
337 |
> + self._intersected = False |
338 |
> + |
339 |
> + @staticmethod |
340 |
> + def _exclude_pattern(s): |
341 |
> + # shlex_split enables quoted whitespace inside patterns |
342 |
> + if s: |
343 |
> + pat = re.compile("|".join( |
344 |
> + fnmatch.translate(x.lstrip(os.sep)) |
345 |
> + for x in shlex_split(s))) |
346 |
> + else: |
347 |
> + pat = None |
348 |
> + return pat |
349 |
> + |
350 |
> + def add(self, entry): |
351 |
> + """ |
352 |
> + Add one NEEDED.ELF.2 entry, for inclusion in the generated |
353 |
> + REQUIRES and PROVIDES values. |
354 |
> + |
355 |
> + @param entry: NEEDED.ELF.2 entry |
356 |
> + @type entry: NeededEntry |
357 |
> + """ |
358 |
> + |
359 |
> + multilib_cat = entry.multilib_category |
360 |
> + if multilib_cat is None: |
361 |
> + # This usage is invalid. The caller must ensure that |
362 |
> + # the multilib category data is supplied here. |
363 |
> + raise AssertionError( |
364 |
> + "Missing multilib category data: %s" % entry.filename) |
365 |
> + |
366 |
> + if entry.needed and ( |
367 |
> + self._requires_exclude is None or |
368 |
> + self._requires_exclude.match( |
369 |
> + entry.filename.lstrip(os.sep)) is None): |
370 |
> + for x in entry.needed: |
371 |
> + if (self._requires_exclude is None or |
372 |
> + self._requires_exclude.match(x) is None): |
373 |
> + self._requires_map.setdefault( |
374 |
> + multilib_cat, set()).add(x) |
375 |
> + |
376 |
> + if entry.soname: |
377 |
> + self._provides_unfiltered.setdefault( |
378 |
> + multilib_cat, set()).add(entry.soname) |
379 |
> + |
380 |
> + if entry.soname and ( |
381 |
> + self._provides_exclude is None or |
382 |
> + (self._provides_exclude.match( |
383 |
> + entry.filename.lstrip(os.sep)) is None and |
384 |
> + self._provides_exclude.match(entry.soname) is None)): |
385 |
> + self._provides_map.setdefault( |
386 |
> + multilib_cat, set()).add(entry.soname) |
387 |
> + |
388 |
> + def _intersect(self): |
389 |
> + requires_map = self._requires_map |
390 |
> + provides_map = self._provides_map |
391 |
> + provides_unfiltered = self._provides_unfiltered |
392 |
> + |
393 |
> + for multilib_cat in set(chain(requires_map, provides_map)): |
394 |
> + requires_map.setdefault(multilib_cat, set()) |
395 |
> + provides_map.setdefault(multilib_cat, set()) |
396 |
> + provides_unfiltered.setdefault(multilib_cat, set()) |
397 |
> + for soname in list(requires_map[multilib_cat]): |
398 |
> + if soname in provides_unfiltered[multilib_cat]: |
399 |
> + requires_map[multilib_cat].remove(soname) |
400 |
> + |
401 |
> + provides_data = [] |
402 |
> + for multilib_cat in sorted(provides_map): |
403 |
> + if provides_map[multilib_cat]: |
404 |
> + provides_data.append(multilib_cat + ":") |
405 |
> + provides_data.extend(sorted(provides_map[multilib_cat])) |
406 |
> + |
407 |
> + if provides_data: |
408 |
> + self._provides = " ".join(provides_data) + "\n" |
409 |
> + |
410 |
> + requires_data = [] |
411 |
> + for multilib_cat in sorted(requires_map): |
412 |
> + if requires_map[multilib_cat]: |
413 |
> + requires_data.append(multilib_cat + ":") |
414 |
> + requires_data.extend(sorted(requires_map[multilib_cat])) |
415 |
> + |
416 |
> + if requires_data: |
417 |
> + self._requires = " ".join(requires_data) + "\n" |
418 |
> + |
419 |
> + self._intersected = True |
420 |
> + |
421 |
> + @property |
422 |
> + def provides(self): |
423 |
> + """ |
424 |
> + @rtype: str |
425 |
> + @return: PROVIDES value generated from NEEDED.ELF.2 entries |
426 |
> + """ |
427 |
> + if not self._intersected: |
428 |
> + self._intersect() |
429 |
> + return self._provides |
430 |
> + |
431 |
> + @property |
432 |
> + def requires(self): |
433 |
> + """ |
434 |
> + @rtype: str |
435 |
> + @return: REQUIRES value generated from NEEDED.ELF.2 entries |
436 |
> + """ |
437 |
> + if not self._intersected: |
438 |
> + self._intersect() |
439 |
> + return self._requires |
440 |
|
Nice logic here :) The only thing that I don't get is why we might need
{provides,requires}_exclude patterns. I guess it's good design
practice, but I can't think of a use case.
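
For reference, the matching itself can be sketched as follows. This assumes Python 3 and uses the standard shlex.split in place of portage's shlex_split; the bundled-library scenario is hypothetical, chosen only to exercise the patterns, and exclude_pattern is not the patch's method.

```python
import fnmatch
import re
import shlex


def exclude_pattern(value):
    """Compile a whitespace-separated list of globs into one regex, or None."""
    globs = shlex.split(value)
    if not globs:
        return None
    # Each glob is translated to a regex; the alternatives are joined with "|".
    return re.compile(
        "|".join(fnmatch.translate(g.lstrip("/")) for g in globs))


# Hypothetical setting: ignore a privately bundled libfoo and anything
# installed under a plugin directory.
pat = exclude_pattern("libfoo.so.* /opt/vendor/plugins/*")

needed = ["libc.so.6", "libfoo.so.1", "libm.so.6"]
kept = [s for s in needed if pat.match(s) is None]
```

Note that the same compiled pattern is matched against both file paths (with the leading slash stripped) and bare sonames, which is why the glob list can mix the two.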
444 |
|
445 |
> diff --git a/pym/portage/util/elf/__init__.py b/pym/portage/util/elf/__init__.py |
446 |
> new file mode 100644 |
447 |
> index 0000000..4725d33 |
448 |
> --- /dev/null |
449 |
> +++ b/pym/portage/util/elf/__init__.py |
450 |
> @@ -0,0 +1,2 @@ |
451 |
> +# Copyright 2015 Gentoo Foundation |
452 |
> +# Distributed under the terms of the GNU General Public License v2 |
453 |
> diff --git a/pym/portage/util/elf/constants.py b/pym/portage/util/elf/constants.py |
454 |
> new file mode 100644 |
455 |
> index 0000000..3857b71 |
456 |
> --- /dev/null |
457 |
> +++ b/pym/portage/util/elf/constants.py |
458 |
> @@ -0,0 +1,36 @@ |
459 |
> +# Copyright 2015 Gentoo Foundation |
460 |
> +# Distributed under the terms of the GNU General Public License v2 |
461 |
> + |
462 |
> +EI_CLASS = 4 |
463 |
> +ELFCLASS32 = 1 |
464 |
> +ELFCLASS64 = 2 |
465 |
> + |
466 |
> +EI_DATA = 5 |
467 |
> +ELFDATA2LSB = 1 |
468 |
> +ELFDATA2MSB = 2 |
469 |
> + |
470 |
> +E_MACHINE = 18 |
471 |
> +EM_SPARC = 2 |
472 |
> +EM_386 = 3 |
473 |
> +EM_68K = 4 |
474 |
> +EM_MIPS = 8 |
475 |
> +EM_PARISC = 15 |
476 |
> +EM_SPARC32PLUS = 18 |
477 |
> +EM_PPC = 20 |
478 |
> +EM_PPC64 = 21 |
479 |
> +EM_S390 = 22 |
480 |
> +EM_ARM = 40 |
481 |
> +EM_ALPHA = 41 |
482 |
> +EM_SH = 42 |
483 |
> +EM_SPARCV9 = 43 |
484 |
> +EM_IA_64 = 50 |
485 |
> +EM_X86_64 = 62 |
486 |
> +EM_AARCH64 = 183 |
487 |
> + |
488 |
> +E_ENTRY = 24 |
489 |
> +EF_MIPS_ABI = 0x0000F000 |
490 |
> +EF_MIPS_ABI2 = 0x00000020 |
491 |
> +E_MIPS_ABI_O32 = 0x00001000 |
492 |
> +E_MIPS_ABI_O64 = 0x00002000 |
493 |
> +E_MIPS_ABI_EABI32 = 0x00003000 |
494 |
> +E_MIPS_ABI_EABI64 = 0x00004000 |
495 |
|
Document where these are coming from, or else we'll lose the connection
to the standard. They should all be in <elf.h> provided by elfutils, but
I'm sure the standard is documented somewhere officially. Even if you
just say "see <elf.h>", that might be enough to clue people in on where
to look for these definitions.
501 |
|
502 |
> diff --git a/pym/portage/util/elf/header.py b/pym/portage/util/elf/header.py |
503 |
> new file mode 100644 |
504 |
> index 0000000..3310eeb |
505 |
> --- /dev/null |
506 |
> +++ b/pym/portage/util/elf/header.py |
507 |
> @@ -0,0 +1,62 @@ |
508 |
> +# Copyright 2015 Gentoo Foundation |
509 |
> +# Distributed under the terms of the GNU General Public License v2 |
510 |
> + |
511 |
> +import collections |
512 |
> + |
513 |
> +from ..endian.decode import (decode_uint16_le, decode_uint32_le, |
514 |
> + decode_uint16_be, decode_uint32_be) |
515 |
> +from .constants import (E_ENTRY, E_MACHINE, EI_CLASS, ELFCLASS32, |
516 |
> + ELFCLASS64, ELFDATA2LSB, ELFDATA2MSB) |
517 |
> + |
518 |
> +class ELFHeader(object): |
519 |
> + |
520 |
> + __slots__ = ('e_flags', 'e_machine', 'ei_class', 'ei_data') |
521 |
> + |
522 |
> + @classmethod |
523 |
> + def read(cls, f): |
524 |
> + """ |
525 |
> + @param f: an open ELF file |
526 |
> + @type f: file |
527 |
> + @rtype: ELFHeader |
528 |
> + @return: A new ELFHeader instance containing data from f |
529 |
> + """ |
530 |
> + f.seek(EI_CLASS) |
531 |
> + ei_class = ord(f.read(1)) |
532 |
> + ei_data = ord(f.read(1)) |
533 |
> + |
534 |
> + if ei_class == ELFCLASS32: |
535 |
> + width = 32 |
536 |
> + elif ei_class == ELFCLASS64: |
537 |
> + width = 64 |
538 |
> + else: |
539 |
> + width = None |
540 |
> + |
541 |
> + if ei_data == ELFDATA2LSB: |
542 |
> + uint16 = decode_uint16_le |
543 |
> + uint32 = decode_uint32_le |
544 |
> + elif ei_data == ELFDATA2MSB: |
545 |
> + uint16 = decode_uint16_be |
546 |
> + uint32 = decode_uint32_be |
547 |
> + else: |
548 |
> + uint16 = None |
549 |
> + uint32 = None |
550 |
> + |
551 |
> + if width is None or uint16 is None: |
552 |
> + e_machine = None |
553 |
> + e_flags = None |
554 |
> + else: |
555 |
> + f.seek(E_MACHINE) |
556 |
> + e_machine = uint16(f.read(2)) |
557 |
> + |
558 |
> + # E_ENTRY + 3 * sizeof(uintN) |
559 |
> + e_flags_offset = E_ENTRY + 3 * width // 8 |
560 |
> + f.seek(e_flags_offset) |
561 |
> + e_flags = uint32(f.read(4)) |
562 |
> + |
563 |
> + obj = cls() |
564 |
> + obj.e_flags = e_flags |
565 |
> + obj.e_machine = e_machine |
566 |
> + obj.ei_class = ei_class |
567 |
> + obj.ei_data = ei_data |
568 |
> + |
569 |
> + return obj |
570 |
|
Looks good. I'm going to perf test this, but I don't think it will be
too big a hit. I don't know how we would get to this point in the code
tree, but let me ask you this: am I right in thinking you won't hit
ELFHeader.read() with every file that's being installed by portage?
You'll only get here for ELF objects? Correct?
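
If it turned out that non-ELF files could reach this code, a magic-number check would keep the cost to one 16-byte read per file. A minimal sketch, assuming Python 3 and using the standard struct module rather than the patch's decode helpers; read_machine_and_flags is a hypothetical name and the headers below are synthetic.

```python
import io
import struct


def read_machine_and_flags(f):
    """Return (e_machine, e_flags) for an ELF file, or None for anything else."""
    f.seek(0)
    ident = f.read(16)
    if len(ident) < 16 or ident[:4] != b"\x7fELF":
        return None  # cheap rejection for non-ELF files
    ei_class, ei_data = ident[4], ident[5]
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        return None
    endian = "<" if ei_data == 1 else ">"
    # e_type(2) e_machine(2) e_version(4), then e_entry/e_phoff/e_shoff,
    # which are 4 bytes each for ELFCLASS32 and 8 bytes each for ELFCLASS64,
    # then e_flags(4). That puts e_flags at offset 36 or 48 respectively.
    addr = "I" if ei_class == 1 else "Q"
    fmt = endian + "HHI" + addr * 3 + "I"
    rest = f.read(struct.calcsize(fmt))
    if len(rest) < struct.calcsize(fmt):
        return None
    fields = struct.unpack(fmt, rest)
    return fields[1], fields[6]


ident64 = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)  # 64-bit, little-endian
amd64 = io.BytesIO(
    ident64 + struct.pack("<HHIQQQI", 2, 62, 1, 0x400000, 64, 0, 0))

ident32 = b"\x7fELF" + bytes([1, 2, 1, 0]) + bytes(8)  # 32-bit, big-endian
mips_o32 = io.BytesIO(
    ident32 + struct.pack(">HHIIIII", 2, 8, 1, 0, 52, 0, 0x1000))

script = io.BytesIO(b"#!/bin/sh\necho hi\n")
```

The e_flags offsets here agree with the patch's E_ENTRY + 3 * width // 8 computation.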
576 |
|
577 |
> diff --git a/pym/portage/util/endian/__init__.py b/pym/portage/util/endian/__init__.py |
578 |
> new file mode 100644 |
579 |
> index 0000000..4725d33 |
580 |
> --- /dev/null |
581 |
> +++ b/pym/portage/util/endian/__init__.py |
582 |
> @@ -0,0 +1,2 @@ |
583 |
> +# Copyright 2015 Gentoo Foundation |
584 |
> +# Distributed under the terms of the GNU General Public License v2 |
585 |
> diff --git a/pym/portage/util/endian/decode.py b/pym/portage/util/endian/decode.py |
586 |
> new file mode 100644 |
587 |
> index 0000000..ec0dcec |
588 |
> --- /dev/null |
589 |
> +++ b/pym/portage/util/endian/decode.py |
590 |
> @@ -0,0 +1,56 @@ |
591 |
> +# Copyright 2015 Gentoo Foundation |
592 |
> +# Distributed under the terms of the GNU General Public License v2 |
593 |
> + |
594 |
> +def decode_uint16_be(data): |
595 |
> + """ |
596 |
> + Decode an unsigned 16-bit integer with big-endian encoding. |
597 |
> + |
598 |
> + @param data: string of bytes of length 2 |
599 |
> + @type data: bytes |
600 |
> + @rtype: int |
601 |
> + @return: unsigned integer value of the decoded data |
602 |
> + """ |
603 |
> + return (ord(data[0:1]) << 8) + ord(data[1:2]) |
604 |
> + |
605 |
> +def decode_uint16_le(data): |
606 |
> + """ |
607 |
> + Decode an unsigned 16-bit integer with little-endian encoding. |
608 |
> + |
609 |
> + @param data: string of bytes of length 2 |
610 |
> + @type data: bytes |
611 |
> + @rtype: int |
612 |
> + @return: unsigned integer value of the decoded data |
613 |
> + """ |
614 |
> + return ord(data[0:1]) + (ord(data[1:2]) << 8) |
615 |
> + |
616 |
> +def decode_uint32_be(data): |
617 |
> + """ |
618 |
> + Decode an unsigned 32-bit integer with big-endian encoding. |
619 |
> + |
620 |
> + @param data: string of bytes of length 4 |
621 |
> + @type data: bytes |
622 |
> + @rtype: int |
623 |
> + @return: unsigned integer value of the decoded data |
624 |
> + """ |
625 |
> + return ( |
626 |
> + (ord(data[0:1]) << 24) + |
627 |
> + (ord(data[1:2]) << 16) + |
628 |
> + (ord(data[2:3]) << 8) + |
629 |
> + ord(data[3:4]) |
630 |
> + ) |
631 |
> + |
632 |
> +def decode_uint32_le(data): |
633 |
> + """ |
634 |
> + Decode an unsigned 32-bit integer with little-endian encoding. |
635 |
> + |
636 |
> + @param data: string of bytes of length 4 |
637 |
> + @type data: bytes |
638 |
> + @rtype: int |
639 |
> + @return: unsigned integer value of the decoded data |
640 |
> + """ |
641 |
> + return ( |
642 |
> + ord(data[0:1]) + |
643 |
> + (ord(data[1:2]) << 8) + |
644 |
> + (ord(data[2:3]) << 16) + |
645 |
> + (ord(data[3:4]) << 24) |
646 |
> + ) |
647 |
> |
648 |
|
Endian fun.
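
For comparison, the standard library can do the same decoding. A minimal sketch, assuming Python 3 (int.from_bytes needs 3.2 or later, so it would not work if Python 2 still has to be supported); the sample bytes are arbitrary.

```python
import struct

data = b"\x12\x34\x56\x78"

# Manual shifts, as in the patch, written here with Python 3 bytes indexing.
manual_be = (data[0] << 24) + (data[1] << 16) + (data[2] << 8) + data[3]
manual_le = data[0] + (data[1] << 8) + (data[2] << 16) + (data[3] << 24)

# struct: ">" is big-endian, "<" is little-endian, "I" is an unsigned 32-bit int.
struct_be = struct.unpack(">I", data)[0]
struct_le = struct.unpack("<I", data)[0]

# int.from_bytes takes the byte order as a string.
int_be = int.from_bytes(data, "big")
int_le = int.from_bytes(data, "little")
```

All three approaches agree, so the choice is mostly about which Python versions need to be supported.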
650 |
|
651 |
|
--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197