On Sunday, 08.02.2009, at 00:59 -0800, Zac Medico wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Tiziano Müller wrote:
> > On Saturday, 07.02.2009, at 15:23 -0800, Zac Medico wrote:
> >> -----BEGIN PGP SIGNED MESSAGE-----
> >> Hash: SHA1
> >>
> >> Tiziano Müller wrote:
> >>> On Monday, 02.02.2009, at 12:34 -0800, Zac Medico wrote:
> >>>> For the digest format, I suggest that we use the leftmost 10
> >>>> hexadecimal digits of the SHA-1 digest. The rationale for limiting
> >>>> it to 10 digits (out of 40) is to save space. Due to the avalanche
> >>>> effect [2], 10 digits should be sufficient to ensure that problems
> >>>> resulting from hash collisions are extremely unlikely.
> >>> I'd recommend to prefix the digest with a "{TYPE}" (like for hashed
> >>> passwords) to be able to change the digest algorithm as needed
> >>> (especially in regards to the current SHA successor competition).
> >>> This allows a future package manager which might use SHA-3 for hashing
> >>> (once it's released) to still check old digests. Furthermore it would
> >>> allow for easier transition and only needs a definition of allowed
> >>> hashes instead of a specific one.
> >> I like that idea. That way it's not necessary to bump the EAPI in
> >> order to change the hash function. So, a typical DIGESTS value might
> >> look like this:
You still have to bump the EAPI if you want to use a hash that isn't
available yet (like SHA-3). The advantage of recording which hash was
used is that new PMs can still handle an old metadata cache.

> >>
> >> SHA1 02021be38b a28b191904 3992945426 6ec21b29a3
> >
> > Sleeping over it again I don't think that truncating a hash is a good
> > idea (truncating it from 40 to 10 digits makes the possibility of
> > collisions much much higher).
>
> The probability of collision is much higher, but it's still
> relatively small. Given the "avalanche effect" that is typical of
> cryptographic hash functions, it's extremely unlikely that collision
> will occur in such a way that it will cause a problem for cache
> validation.
As I understand it, the "avalanche effect" is what a hash function
needs so that collisions can't be calculated easily (it plays the role
that diffusion plays in crypto algorithms): a small change in the input
should affect as many digits of the hash as possible. But when somebody
patches an eclass, the changes aren't necessarily small, so the only
thing that counts is the probability of a collision.
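
To put rough numbers on that probability, here is a minimal sketch. It
assumes that truncated SHA-1 digests behave like uniformly random
values, which is an idealisation. The function names and the example
count of 1000 digests are made up for illustration.

```python
import math


def accidental_match_probability(hex_digits: int) -> float:
    """Chance that one changed file gets the same digest as its old version."""
    return 16.0 ** -hex_digits


def any_collision_probability(n_items: int, hex_digits: int) -> float:
    """Birthday approximation: chance that any two of n_items digests collide."""
    space = 16 ** hex_digits  # number of distinct digest values
    return -math.expm1(-n_items * (n_items - 1) / (2 * space))


# Compare the truncated (10 digits) and the full (40 digits) SHA-1 digest.
# The count of 1000 digests is an arbitrary, hypothetical example value.
for digits in (10, 40):
    print(digits,
          accidental_match_probability(digits),
          any_collision_probability(1000, digits))
```

Under these assumptions both figures are far larger for 10 digits than
for 40, though still small in absolute terms.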

>
> > But if you want to go this way, I'd say you should use something like
> > SHA1t (t for truncated) to make sure we can use full hashes once we feel
> > it's appropriate.
>
> We could, but I think SHA1 would also be fine since one can infer
> from the length of the string that it's been truncated.
No, guessing is a bad idea here: the string could just as well be short
because the metadata is faulty. But the main motivation is that if you
write SHA1, everyone reading it expects a full SHA1 hash, which it isn't.
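
The point about not guessing can be sketched as follows. This is only an
illustration: the label-to-length table, the function names, and the
whitespace-separated layout (borrowed from the DIGESTS example quoted
above) are assumptions, not a proposed implementation.

```python
import hashlib

# Hypothetical table: each label states exactly how long a digest must be.
EXPECTED_HEX_LENGTH = {"SHA1": 40, "SHA1t": 10}


def make_digests(label: str, blobs: list[bytes]) -> str:
    """Build a value such as 'SHA1t 0123456789 ...' from raw file contents."""
    length = EXPECTED_HEX_LENGTH[label]
    digests = [hashlib.sha1(blob).hexdigest()[:length] for blob in blobs]
    return " ".join([label] + digests)


def parse_digests(value: str) -> tuple[str, list[str]]:
    """Split a value into label and digests, rejecting unexpected lengths."""
    label, *digests = value.split()
    if label not in EXPECTED_HEX_LENGTH:
        raise ValueError(f"unknown digest type: {label}")
    expected = EXPECTED_HEX_LENGTH[label]
    for digest in digests:
        if len(digest) != expected:
            # A short string under a full-length label is treated as damage,
            # not silently interpreted as an intentional truncation.
            raise ValueError(f"{label} digest has wrong length: {digest}")
    return label, digests
```

With an explicit label, `parse_digests("SHA1 02021be38b")` raises an
error instead of being read as a truncated hash.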

But if your goal is to reduce the size of the metadata cache, why
store the hashes of the eclasses in each ebuild's metadata and not in a
separate dir? They have to be the same for every ebuild, don't they?
If the average number of eclasses is bigger than 4, you can even store
the full hash in less space than truncated hashes for all eclasses
would take.
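
The break-even point of 4 follows from the digest lengths alone (40
versus 10 hex digits). A small sketch of that arithmetic, counting only
hex digits and ignoring labels and separators. It assumes a single full
digest is stored per ebuild, which is only one reading of the
suggestion, and all names are hypothetical.

```python
FULL_HEX_DIGITS = 40       # full SHA-1 digest
TRUNCATED_HEX_DIGITS = 10  # truncation proposed earlier in the thread


def truncated_cost(n_eclasses: int) -> int:
    """Hex digits needed for one truncated digest per eclass."""
    return n_eclasses * TRUNCATED_HEX_DIGITS


def full_hash_is_smaller(n_eclasses: int) -> bool:
    """True if one full digest takes less space than n truncated ones."""
    return FULL_HEX_DIGITS < truncated_cost(n_eclasses)
```

Here `full_hash_is_smaller(4)` is false (40 versus 40 digits) and
`full_hash_is_smaller(5)` is true (40 versus 50 digits).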

--
-------------------------------------------------------
Tiziano Müller
Gentoo Linux Developer, Council Member
Areas of responsibility:
Samba, PostgreSQL, CPP, Python, sysadmin
E-Mail : dev-zero@g.o
GnuPG FP : F327 283A E769 2E36 18D5 4DE2 1B05 6A63 AE9C 1E30