Hi,

This mail was first sent to dev-portage so it's written for that
audience, but it should be understandable for normal devs as well ;)

Short summary: current portage versions won't be able to handle any
modification to the digest format so we have to find a different way if
we want support for SHA1 or other algorithms.


And now the more detailed mail:

As was discussed again on -dev recently, we need more digest algorithms
for file verification. One way that would be halfway compatible would be
to add additional lines, using the same syntax as for the current MD5
checksums, to the digests and Manifests. However, that means a lot of
redundancy, as for each additional algorithm the filename and filesize
would be duplicated. It's also not trivial to do, as there are several
functions dealing with digests and they all parse them a bit differently
(I tried to add SHA1 support for digests and Manifests; it took me about
an hour before I gave up). Also, as soon as we add non-MD5 lines to
digests, all currently released portage versions will blow up (as they
will treat the provided hash as an MD5 value; call it a bug if you want).
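
To make that failure mode concrete, here is a minimal sketch in Python.
It is not portage's actual code: `old_style_check` is a hypothetical
function, and the line layout `<ALGO> <checksum> <filename> <size>` is an
assumption about the current digest files made for illustration only.

```python
import hashlib

def old_style_check(digest_line: str, data: bytes) -> bool:
    """Mimic a parser that ignores the first field and always assumes MD5."""
    # Assumed layout (not taken from the mail): ALGO CHECKSUM FILENAME SIZE
    _algo, checksum, _filename, size = digest_line.split()
    return len(data) == int(size) and hashlib.md5(data).hexdigest() == checksum

data = b"example distfile contents"
md5_line = f"MD5 {hashlib.md5(data).hexdigest()} example.tar.bz2 {len(data)}"
sha1_line = f"SHA1 {hashlib.sha1(data).hexdigest()} example.tar.bz2 {len(data)}"

print(old_style_check(md5_line, data))   # True
print(old_style_check(sha1_line, data))  # False: SHA1 value compared as MD5
```

An intact file fails verification as soon as the line carries a SHA1
value, because the checker never looks at the algorithm name.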

Instead I suggest we completely reorganize the digest system from
scratch by unifying the digests and the Manifest files. As you all know,
our tree is getting bigger and bigger with no end in sight. That,
combined with the usual filesystem overhead, causes a lot of wasted space
on many systems. By unifying the digests with the Manifests we could
kill more than 15,000 very small files at once (in the long run; this
would require compatible portage versions for all users).

As for the new syntax, it should allow us to add new digest algorithms
to portage without changing the syntax. My current idea would be that
for each file in the tree and in SRC_URI we have a line specifying:
- the filename
- the filesize
- n digests (each consisting of the algorithm name and the checksum)
To maintain compatibility and support future enhancements, each of these
lines has to be prefixed with a (set of) keyword(s) (FILE or DIGEST or
SRC_URI,EBUILD,AUXFILE).
Example lines could be:

SRC_URI portage-2.0.51_rc7.tar.bz2 274572 MD5 1234 SHA1 abcd RMD160 9876
EBUILD portage-2.0.51_rc7.ebuild 11806 MD5 xyz SHA1 fifteen

(using fake checksums for readability).
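
A minimal sketch of how such lines could be parsed, in Python. The names
`Entry` and `parse_line` are hypothetical, and reading the prefix as a
comma-separated keyword set is an assumption, since the exact keyword
syntax is still open. Filenames containing whitespace are not handled.

```python
from typing import NamedTuple

class Entry(NamedTuple):
    kinds: tuple    # keyword prefix, e.g. ("SRC_URI",)
    filename: str
    size: int
    digests: dict   # algorithm name -> checksum string

def parse_line(line: str) -> Entry:
    """Parse '<KEYWORD[,KEYWORD...]> <filename> <size> (<ALGO> <checksum>)*'."""
    fields = line.split()
    # Three fixed fields, then an even number of algorithm/checksum fields.
    if len(fields) < 3 or (len(fields) - 3) % 2 != 0:
        raise ValueError(f"malformed line: {line!r}")
    kinds = tuple(fields[0].split(","))  # assumed: comma-separated keywords
    filename, size = fields[1], int(fields[2])
    # Remaining fields alternate between algorithm name and checksum.
    digests = dict(zip(fields[3::2], fields[4::2]))
    return Entry(kinds, filename, size, digests)

entry = parse_line(
    "SRC_URI portage-2.0.51_rc7.tar.bz2 274572 MD5 1234 SHA1 abcd RMD160 9876"
)
print(entry.kinds, entry.filename, entry.size)
print(entry.digests)
```

Because the digests are just name/checksum pairs, a line carrying an
additional algorithm parses without any change to this code.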

Maybe the system can also be extended to incorporate GLEP 25 without
adding a ton of new files; I'd need some input from Brian on that issue.

The biggest problem for this proposal is of course compatibility. A
rough transition plan could be:
- keep digests as they are now
- add the new format to Manifests (in addition to the current MD5 lines)
- support the new format in 2.0.52 (use it optionally for verification)
- use it for verification in 2.1 by default (and drop support for the
  old system)
- exclude the old digests from `emerge --sync` in 2.1
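
As a rough sketch of what verification against a new-format line might
look like in a version that supports it: check the size, then check
every digest whose algorithm this version knows, and skip the rest. The
function `verify`, the `KNOWN` table, and the algorithm name
`FUTUREHASH` are hypothetical; rejecting a line that offers no known
algorithm at all is one possible policy, not a settled decision.

```python
import hashlib

# Algorithms this hypothetical portage version knows about.
KNOWN = {"MD5": hashlib.md5, "SHA1": hashlib.sha1}

def verify(data: bytes, size: int, digests: dict) -> bool:
    """Check size plus every digest whose algorithm is known."""
    if len(data) != size:
        return False
    checked = 0
    for algo, expected in digests.items():
        hasher = KNOWN.get(algo)
        if hasher is None:
            continue  # newer algorithm: ignore instead of blowing up
        if hasher(data).hexdigest() != expected:
            return False
        checked += 1
    # Assumed policy: at least one known algorithm must have matched.
    return checked > 0

data = b"example distfile contents"
digests = {
    "MD5": hashlib.md5(data).hexdigest(),
    "SHA1": hashlib.sha1(data).hexdigest(),
    "FUTUREHASH": "ffff",  # made-up algorithm name, skipped above
}
print(verify(data, len(data), digests))         # True
print(verify(data + b"x", len(data), digests))  # False (size mismatch)
```

This is the property the old digest parsing lacks: an unknown algorithm
on the line does not break versions that predate it.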

And finally a summarizing list of reasons for the format:
- keeps all checksums of a package in one place
- removes one level of indirection for signing
- digest generation currently recreates the Manifest anyway
- removes files from the tree
- allows for easy addition of new digest algorithms
- any syntax modification to the current digest files brings
  compatibility problems with all currently existing portage versions,
  while Manifest changes do not
- potential to discover file collisions more easily (currently you can
  have the same file in two digests with different checksums; not a real
  problem yet, though)
- removes redundancy for common files
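
The collision point can be sketched as follows, assuming the entries
have already been parsed into (package, filename, size, digests) tuples.
The function `find_collisions` and the package names are made up for
illustration; only algorithms present in both entries are compared.

```python
from collections import defaultdict

def find_collisions(entries):
    """Return {filename: [packages]} for files whose entries disagree."""
    seen = defaultdict(list)
    for package, filename, size, digests in entries:
        seen[filename].append((package, size, digests))

    conflicts = {}
    for filename, records in seen.items():
        _, first_size, first_digests = records[0]
        for _, size, digests in records[1:]:
            shared = first_digests.keys() & digests.keys()
            if size != first_size or any(
                first_digests[algo] != digests[algo] for algo in shared
            ):
                conflicts[filename] = [pkg for pkg, _, _ in records]
                break
    return conflicts

entries = [
    ("app-misc/foo", "common.patch", 120, {"MD5": "1234"}),
    ("app-misc/bar", "common.patch", 120, {"MD5": "9999"}),
    ("app-misc/baz", "other.tar.bz2", 500, {"MD5": "abcd"}),
]
print(find_collisions(entries))
# {'common.patch': ['app-misc/foo', 'app-misc/bar']}
```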

Let the discussions begin.

Marius