On Wed, 2018-11-21 at 11:45 +0100, Fabian Groffen wrote:
> > > > > > 5. **Metadata is not compressed.** This is not a significant problem,
> > > > > > it is just listed for completeness.
> > > > > >
> > > > > >
> > > > > > Goals for a new container format
> > > > > > --------------------------------
> > > > > >
> > > > > > The following goals have been set for a replacement format:
> > > > > >
> > > > > > 1. **The packages must remain contained in a single file.** As a matter
> > > > > > of user convenience, it should be possible to transfer binary
> > > > > > packages without having to use multiple files, and to install them
> > > > > > from any location.
> > > > > >
> > > > > > 2. **The file format must be entirely based on common file formats,
> > > > > > respecting best practices, with as little customization as necessary
> > > > > > to satisfy the requirements.** In particular, it is unacceptable
> > > > > > to create new binary formats.
> > > > >
> > > > > I take this as your personal opinion. I don't quite get why it is
> > > > > unacceptable to create a new binary format though. In particular when
> > > > > you're looking for efficiency, such a format could serve your purposes.
> > > > > As long as it's clearly defined, I don't see the problem with a binary
> > > > > format either.
> > > > > Could you add why it is you think binary formats are unacceptable here?
> > > >
> > > > Because custom binary formats require specialized tooling, and are
> > > > a royal PITA when the user wants to do something that the author of
> > > > specialized tooling just happened not to think worthwhile, or when
> > > > the tooling is not available for some reason. And before you ask really
> > > > silly questions, yes, I did fight binary packages with a hex editor
> > > > at some point.
> > >
> > > Which I still don't understand, to be frank. I think even Portage
> > > exposes python APIs to get to the data.
> >
> > Compare the time needed to make a trivial (but unforeseen) change
> > on a format that's transparent vs a format that requires you to learn
> > its spec and/or API, write a program and debug it.
>
> I was under the impression you could unpack a tbz2 into data and xpak,
> then unpack both, modify the contents with an editor or whatever, and
> then pack the whole stuff back into a tbz2 again. In the worst case,
> this can be done by running emerge -k <pkg>, modifying the vdb, and
> running quickpkg <pkg> afterwards.

In the described example, the whole necessity of modifying the binary
package arises from it being broken, and therefore unsuitable for
'emerge -k'.

> I know that with portage-utils you can do this easily with the qtbz2 and
> qxpak commands. No need to do anything with a hex editor, or know
> anything about how it's done.

Actually, you need to:

a. know that portage-utils has the appropriate tools (it's non-obvious),

b. know how to use portage-utils.

This is non-obvious. It took me a while to figure out that I need to
use qtbz2 before using qxpak (why would it work only on split data when
the format is explicitly written to be used on top of a compressed
archive?!).

> The obvious advantage of your approach is that you don't need q* tools,
> but can use tar instead. The editing is as trivial, though. In your case
> you need a special procedure to reconstruct the binpkg should you want
> to keep your special properties (label, order) which equates to q* tools
> somewhat.

Except you don't need to keep them. The spec is quite explicit that
they're optimizations and that the package must work even if they're
lost as part of an editing exercise.
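
To make that concrete, here is a rough Python sketch (not the reference
implementation) of a reader that looks members up by name, so it keeps
working whatever order an editor put them back in. The member names
(metadata.tar.xz, image.tar.xz), the 'foo-1.0' directory and both helper
functions are placeholders picked for this example, not quoted from
the spec.

```python
import io
import tarfile


def build_container(members):
    """Build an uncompressed tar in memory from (name, data) pairs, in the given order."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in members:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def find_parts(container):
    """Map member basename -> contents, without relying on member order."""
    parts = {}
    with tarfile.open(fileobj=io.BytesIO(container), mode="r:") as tar:
        for info in tar:
            if info.isfile():
                basename = info.name.rsplit("/", 1)[-1]
                parts[basename] = tar.extractfile(info).read()
    return parts


# Hypothetical member names; the payloads are dummy bytes, not real archives.
metadata = ("foo-1.0/metadata.tar.xz", b"META")
image = ("foo-1.0/image.tar.xz", b"IMAGE")

original = build_container([metadata, image])
repacked = build_container([image, metadata])  # order lost while editing

print(find_parts(original) == find_parts(repacked))  # True
```

A reader that honours the recommended order can stop early, which is
where the optimization comes from; one like the above just reads
a little more.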

>
> > > > The most trivial case is an attempted recovery of a broken system.
> > > > If you don't have Portage working and don't have portage-utils
> > > > installed, do you really prefer a custom format which will require you
> > > > to fetch and compile special tools? Or one that can be processed
> > > > with tools you're quite likely to have on every system, like tar?
> > >
> > > Well, I think the idea behind the original binpkg format was to use tar
> > > directly on the files in emergency scenarios like these...
> > > The assumption was that a bzip2 decompressor and tar would be available.
> > > I think it is an example of how you add something, while still allowing
> > > a fallback to existing tools.
> >
> > Except progress in compressors has made it work less and less reliably.
> > It's mostly an example of how to be *clever*. However, being clever
> > usually doesn't pay off in the long term, compared to doing things *in a
> > simple way*.
>
> We agree it is hackish, and we agree we can do without it. You simply
> exaggerate the problem, IMO, which mostly isn't there, because it works
> fine today. It can also be solved today using shell tools.
>
> % head -c `grep -abo 'XPAKPACK' $EPREFIX/usr/portage/packages/sys-apps/sed-4.5.tbz2 | sed 's/:.*$//'` $EPREFIX/usr/portage/packages/sys-apps/sed-4.5.tbz2 | tar -jxf -
>
> results in no warnings/errors from bzip about trailing garbage, possible
> thanks to the spec being smart enough about this.

Well, you aren't going to call that simple, are you? Plus, I think your
solution would fail if the bzip2 output just happened to contain the
'XPAKPACK' string. Not saying it's likely to happen, but relying on fixed
strings not occurring accidentally is not good design.
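
To illustrate, here is a minimal Python sketch. It assumes the layout
as I remember it from xpak(5): the file ends with a 4-byte big-endian
length of the xpak block followed by 'STOP', and the xpak block itself
starts with 'XPAKPACK' and ends with 'XPAKSTOP'. The helper names and
the fake package are made up for the example. It compares the
grep-style split with one that reads the trailer instead.

```python
import struct

MAGIC = b"XPAKPACK"


def split_by_search(blob):
    """Grep-style split: assume the tarball ends at the first MAGIC."""
    return blob.index(MAGIC)


def split_by_trailer(blob):
    """Trailer-based split: use the xpak length stored at the end of the file.

    Assumed layout: <tarball> <xpak block> <4-byte big-endian length> b"STOP"
    """
    if blob[-4:] != b"STOP":
        raise ValueError("no STOP marker at the end of the file")
    (xpak_len,) = struct.unpack(">I", blob[-8:-4])
    start = len(blob) - 8 - xpak_len
    if start < 0 or blob[start:start + 8] != MAGIC:
        raise ValueError("xpak block not found where the trailer points")
    return start


def make_fake_tbz2(payload):
    """Build a fake package: arbitrary payload plus an empty xpak block."""
    # Empty index and data, so index_len = data_len = 0.
    xpak = MAGIC + struct.pack(">II", 0, 0) + b"XPAKSTOP"
    return payload + xpak + struct.pack(">I", len(xpak)) + b"STOP"


# A payload that happens to contain the magic string, as compressed data could.
payload = b"\x00compressed" + MAGIC + b"more data\xff"
pkg = make_fake_tbz2(payload)

print(split_by_search(pkg) == len(payload))   # False: cut in the wrong place
print(split_by_trailer(pkg) == len(payload))  # True
```

The search-based split truncates the tarball at the accidental match,
while the trailer-based one does not care what the payload contains.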

--
Best regards,
Michał Górny