On Monday, 02.02.2009, at 12:34 -0800, Zac Medico wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi,
>
> I'd like to add a new metadata cache value called DIGESTS which will
> contain a space separated list of digests which can be
> used to validate the metadata cache. Like INHERITED and
> DEFINED_PHASES [1], it will be automatically generated. The first
> digest in the list will correspond to the ebuild. If there are any
> inherited eclasses, the digests of those eclasses will follow in a
> space separated list, in the same order that they occur in the
> INHERITED variable. The value of the DIGESTS variable will be on
> line 18 of the metadata cache (just after DEFINED_PHASES).
>
> For the digest format, I suggest that we use the leftmost 10
> hexadecimal digits of the SHA-1 digest. The rationale for limiting
> it to 10 digits (out of 40) is to save space. Due to the avalanche
> effect [2], 10 digits should be sufficient to ensure that problems
> resulting from hash collisions are extremely unlikely.
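A minimal sketch of how the proposed value could be computed, assuming
the ebuild and eclass contents are already available as bytes. The names
short_digest and digests_value are hypothetical and exist only for
illustration; this is not Portage code.

```python
import hashlib
from typing import Iterable

DIGEST_LEN = 10  # leftmost hex digits kept, as suggested in the proposal


def short_digest(data: bytes) -> str:
    """Return the leftmost DIGEST_LEN hex digits of the SHA-1 of data."""
    return hashlib.sha1(data).hexdigest()[:DIGEST_LEN]


def digests_value(ebuild: bytes, eclasses: Iterable[bytes]) -> str:
    """Build the space separated DIGESTS value.

    The ebuild digest comes first, followed by the eclass digests in the
    order the caller supplies them (assumed to match INHERITED).
    """
    return " ".join(short_digest(d) for d in [ebuild, *eclasses])
```

Ten hexadecimal digits keep 40 of the 160 bits that SHA-1 produces.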
I'd recommend prefixing each digest with a "{TYPE}" tag (as is done for
hashed passwords) so that the digest algorithm can be changed as needed
(especially with regard to the current SHA successor competition).
This allows a future package manager which might use SHA-3 for hashing
(once it's released) to still check old digests. Furthermore, it would
allow for an easier transition and only needs a definition of allowed
hashes instead of a specific one.
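
As a rough sketch of what such a prefix could look like in practice,
assuming a "{TYPE}hexdigits" entry format and a table of allowed hashes:
the ALLOWED table, the type names and the functions make_digest and
check_digest are all hypothetical, chosen only for illustration.

```python
import hashlib
import re

# Hypothetical table of allowed hash types; a real list would need to be
# agreed on separately.
ALLOWED = {"SHA1": hashlib.sha1, "SHA256": hashlib.sha256}
DIGEST_LEN = 10  # leftmost hex digits kept, as in the original proposal

_PREFIXED = re.compile(r"\{([A-Z0-9]+)\}([0-9a-f]+)")


def make_digest(data: bytes, hash_type: str = "SHA1") -> str:
    """Return a truncated digest prefixed with its type, e.g. {SHA1}..."""
    hexdigest = ALLOWED[hash_type](data).hexdigest()
    return "{%s}%s" % (hash_type, hexdigest[:DIGEST_LEN])


def check_digest(data: bytes, entry: str) -> bool:
    """Validate data against one prefixed entry.

    Malformed entries and unknown hash types never validate.
    """
    match = _PREFIXED.fullmatch(entry)
    if match is None or match.group(1) not in ALLOWED:
        return False
    hash_type, expected = match.groups()
    actual = ALLOWED[hash_type](data).hexdigest()
    return actual[:len(expected)] == expected
```

With this layout an old "{SHA1}..." entry stays checkable after the
default hash type changes, because each entry names its own algorithm.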
|
>
> The primary reason to use a digest for cache validation instead of a
> timestamp is that it allows the cache validation mechanism to work
> even if the tree is distributed with a protocol that does not
> preserve timestamps, such as git or subversion. This would make it
Well, usually you don't keep intermediate or generated files in a VCS,
so why keep the metadata cache there?
|
> possible to distribute metadata cache directly from git and
> subversion repositories (among others). Since a digest is inherently
> more expensive to obtain than a timestamp, package managers may use
> the Manifest entries as a digest cache, in order to avoid the need
> to compute digests of ebuilds during dependency calculations.
>
> Does the suggested approach seem reasonable? Would anybody like to
> suggest any changes?
|
Cheers,
Tiziano

--
-------------------------------------------------------
Tiziano Müller
Gentoo Linux Developer, Council Member
Areas of responsibility:
Samba, PostgreSQL, CPP, Python, sysadmin
E-Mail : dev-zero@g.o
GnuPG FP : F327 283A E769 2E36 18D5 4DE2 1B05 6A63 AE9C 1E30