Gentoo Archives: gentoo-dev

From: "Michał Górny" <mgorny@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] [pre-GLEP] Gentoo binary package container format
Date: Wed, 21 Nov 2018 11:20:43
Message-Id: 1542799232.30154.7.camel@gentoo.org
In Reply to: Re: [gentoo-dev] [pre-GLEP] Gentoo binary package container format by Fabian Groffen
1 On Wed, 2018-11-21 at 11:45 +0100, Fabian Groffen wrote:
2 > > > > > > 5. **Metadata is not compressed.** This is not a significant problem,
3 > > > > > > it is just listed for completeness.
4 > > > > > >
5 > > > > > >
6 > > > > > > Goals for a new container format
7 > > > > > > --------------------------------
8 > > > > > >
9 > > > > > > The following goals have been set for a replacement format:
10 > > > > > >
11 > > > > > > 1. **The packages must remain contained in a single file.** As a matter
12 > > > > > > of user convenience, it should be possible to transfer binary
13 > > > > > > packages without having to use multiple files, and to install them
14 > > > > > > from any location.
15 > > > > > >
16 > > > > > > 2. **The file format must be entirely based on common file formats,
17 > > > > > > respecting best practices, with as little customization as necessary
18 > > > > > > to satisfy the requirements.** In particular, it is unacceptable
19 > > > > > > to create new binary formats.
20 > > > > >
21 > > > > > I take this as your personal opinion. I don't quite get why it is
22 > > > > > unacceptable to create a new binary format though. In particular when
23 > > > > > you're looking for efficiency, such format could serve your purposes.
24 > > > > > As long as it's clearly defined, I don't see the problem with a binary
25 > > > > > format either.
26 > > > > > Could you add why it is you think binary formats are unacceptable here?
27 > > > >
28 > > > > Because custom binary formats require specialized tooling, and are
29 > > > > a royal PITA when the user wants to do something that the author of
30 > > > > specialized tooling just happened not to think worthwhile, or when
31 > > > > the tooling is not available for some reason. And before you ask really
32 > > > > silly questions, yes, I did fight binary packages over hex editor
33 > > > > at some point.
34 > > >
35 > > > Which I still don't understand, to be frank. I think even Portage
36 > > > exposes python APIs to get to the data.
37 > >
38 > > Compare the time needed to make a trivial (but unforeseen) change
39 > > on a format that's transparent vs a format that requires you to learn
40 > > its spec and/or API, write a program and debug it.
41 >
42 > I was under the impression you could unpack a tbz2 into data and xpak,
43 > then unpack both, modify the contents with an editor or whatever, and
44 > then pack the whole stuff back into a tbz2 again. This can be done
45 > worst case scenario by emerge -k <pkg>, modifying the vdb and quickpkg
46 > <pkg> afterwards.
47
48 In the described example, the whole necessity of modifying the binary
49 package arises from it being broken, therefore unsuitable for
50 'emerge -k'.
51
52 > I know that with portage-utils you can do this easily with the qtbz2 and
53 > qxpak commands. No need to do anything with a hex editor, or know
54 > anything about how it's done.
55
56 Actually, you need to:
57
58 a. know that portage-utils has the appropriate tools (it's non-obvious),
59
60 b. know how to use portage-utils.
61
62 This is non-obvious. It took me a while to figure out that I need to
63 use qtbz2 before using qxpak (why would it work only on split data when
64 the format is explicitly written to be used on top of compressed
65 archive?!).
66
67 > Obvious advantage of your approach is that you don't need q* tools, but
68 > can use tar instead. The editing is as trivial though. In your case
69 > you need a special procedure to reconstruct the binpkg should you want
70 > to keep your special properties (label, order) which equates to q* tools
71 > somewhat.
72
73 Except you don't need to keep them. The spec is quite explicit that
74 they're optimizations and that the package must work even if they're
75 lost as part of an editing exercise.
76
77 >
78 > > > > The most trivial case is an attempted recovery of a broken system.
79 > > > > If you don't have Portage working and don't have portage-utils
80 > > > > installed, do you really prefer a custom format which will require you
81 > > > > to fetch and compile special tools? Or one that can be processed
82 > > > > with tools you're quite likely to have on every system, like tar?
83 > > >
84 > > > Well, I think the idea behind the original binpkg format was to use tar
85 > > > directly on the files in emergency scenarios like these...
86 > > > The assumption was bzip2 decompressor and tar being available.
87 > > > I think it is an example of how you add something, while still allowing
88 > > > to fallback on existing tools.
89 > >
90 > > Except progress in compressors has made it work less and less reliably.
91 > > It's mostly an example of how to be *clever*. However, being clever
92 > > usually doesn't pay off in the long term, compared to doing things *in a
93 > > simple way*.
94 >
95 > We agree it is hackish, and we agree we can do without. You simply
96 > exaggerate the problem, IMO, which mostly isn't there, because it works
97 > fine today. It can also be solved today using shell tools.
98 >
99 > % head -c `grep -abo 'XPAKPACK' $EPREFIX/usr/portage/packages/sys-apps/sed-4.5.tbz2 | sed 's/:.*$//'` $EPREFIX/usr/portage/packages/sys-apps/sed-4.5.tbz2 | tar -jxf -
100 >
101 > results in no warnings/errors from bzip about trailing garbage, possible
102 > thanks to the spec being smart enough about this.
103
104 Well, you aren't going to call that simple, are you? Plus, I think your
105 solution would fail if the bzip2 output just happened to contain the
106 'XPAKPACK' string. I'm not saying it's likely to happen, but relying on
107 fixed strings not occurring accidentally is not good design.
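
To illustrate the point, here is a minimal sketch of a split that does not search for a magic string at all. It assumes the layout documented in xpak(5): the file ends with a 4-byte big-endian xpak length followed by 'STOP', and the xpak segment itself runs from 'XPAKPACK' to 'XPAKSTOP'. The function name split_tbz2 is made up for this example and is not part of Portage or portage-utils.

```python
import struct
from typing import Tuple


def split_tbz2(blob: bytes) -> Tuple[bytes, bytes]:
    """Split a tbz2 binpkg into (tar.bz2 bytes, xpak bytes).

    Assumed layout (per xpak(5)):
        <tarball> <xpak> <xpak_len: 4 bytes, big endian> "STOP"
    where <xpak> starts with "XPAKPACK" and ends with "XPAKSTOP".
    """
    # The trailer is the last 8 bytes: xpak length + "STOP".
    if len(blob) < 8 or blob[-4:] != b"STOP":
        raise ValueError("no tbz2 trailer found")
    (xpak_len,) = struct.unpack(">I", blob[-8:-4])

    # The xpak segment sits directly in front of the trailer.
    start = len(blob) - 8 - xpak_len
    if start < 0:
        raise ValueError("xpak length exceeds file size")
    xpak = blob[start:-8]

    # Sanity-check the segment boundaries instead of trusting the length.
    if not (xpak.startswith(b"XPAKPACK") and xpak.endswith(b"XPAKSTOP")):
        raise ValueError("xpak segment is malformed")
    return blob[:start], xpak


if __name__ == "__main__":
    # Synthetic input: the "tarball" part deliberately contains the magic
    # string, which is the case a plain grep for 'XPAKPACK' gets wrong.
    tarball = b"BZh9 fake payload XPAKPACK more payload"
    xpak = b"XPAKPACK" + struct.pack(">II", 0, 0) + b"XPAKSTOP"
    blob = tarball + xpak + struct.pack(">I", len(xpak)) + b"STOP"

    data, meta = split_tbz2(blob)
    print(data == tarball, meta == xpak)
    print(blob.find(b"XPAKPACK") == len(tarball))
```

Note that this is already a small program that requires knowing the spec, which is the cost being discussed above.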
108
109 --
110 Best regards,
111 Michał Górny
