Gentoo Archives: gentoo-dev

From: "Tiziano Müller" <dev-zero@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] [RFC] DIGESTS metadata variable for cache validation
Date: Sat, 07 Feb 2009 22:32:06
Message-Id: 1234045916.24784.1373.camel@localhost
In Reply to: [gentoo-dev] [RFC] DIGESTS metadata variable for cache validation by Zac Medico
On Monday, 02.02.2009, at 12:34 -0800, Zac Medico wrote:
> Hi,
>
> I'd like to add a new metadata cache value called DIGESTS which will
> contain a space separated list of digests which can be
> used to validate the metadata cache. Like INHERITED and
> DEFINED_PHASES [1], it will be automatically generated. The first
> digest in the list will correspond to the ebuild. If there are any
> inherited eclasses, the digests of those eclasses will follow in a
> space separated list, in the same order that they occur in the
> INHERITED variable. The value of the DIGESTS variable will be on
> line 18 of the metadata cache (just after DEFINED_PHASES).
>
> For the digest format, I suggest that we use the leftmost 10
> hexadecimal digits of the SHA-1 digest. The rationale for limiting
> it to 10 digits (out of 40) is to save space. Due to the avalanche
> effect [2], 10 digits should be sufficient to ensure that problems
> resulting from hash collisions are extremely unlikely.
I'd recommend prefixing each digest with a "{TYPE}" tag (as is done for
hashed passwords) so that the digest algorithm can be changed as needed
(especially in view of the ongoing SHA successor competition). This would
allow a future package manager that uses SHA-3 for hashing (once it's
released) to still check old digests. It would also ease the transition,
since the spec would only need to define a set of allowed hashes instead
of a single specific one.

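As a rough illustration of the two ideas combined, here is a minimal Python sketch that builds a DIGESTS value with a type prefix and checks a single entry against file contents. The exact "{SHA1}" spelling, the function names, the ALLOWED table, and the choice to prefix every entry are assumptions made for this example only; neither the proposal nor any package manager defines them.

```python
import hashlib
import re
from typing import Sequence

# Hypothetical table of allowed digest types. The "{TYPE}" spelling is an
# assumption modelled on hashed-password prefixes, not an agreed format.
ALLOWED = {"SHA1": hashlib.sha1, "SHA256": hashlib.sha256}
DIGEST_LEN = 10  # leftmost hex digits kept, as suggested in the proposal

_ENTRY = re.compile(r"^\{([A-Z0-9]+)\}([0-9a-f]+)$")


def short_digest(data: bytes, algo: str = "SHA1") -> str:
    """Return '{ALGO}' followed by the leftmost DIGEST_LEN hex digits."""
    return "{%s}%s" % (algo, ALLOWED[algo](data).hexdigest()[:DIGEST_LEN])


def make_digests(ebuild: bytes, eclasses: Sequence[bytes], algo: str = "SHA1") -> str:
    """Ebuild digest first, then eclass digests in INHERITED order."""
    return " ".join(short_digest(d, algo) for d in [ebuild, *eclasses])


def entry_matches(entry: str, data: bytes) -> bool:
    """Check one DIGESTS entry against file contents, using its own type."""
    m = _ENTRY.match(entry)
    if m is None or m.group(1) not in ALLOWED:
        return False  # unknown or malformed type: treat as not validated
    algo, digits = m.groups()
    return ALLOWED[algo](data).hexdigest().startswith(digits)
```

Because each entry carries its own type, old and new entries could coexist during a transition. With 10 hex digits (40 bits), a changed file would be mistaken for the old one with probability of about 2^-40, roughly 9e-13 per comparison, assuming the hash output behaves uniformly.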
>
> The primary reason to use a digest for cache validation instead of a
> timestamp is that it allows the cache validation mechanism to work
> even if the tree is distributed with a protocol that does not
> preserve timestamps, such as git or subversion. This would make it
Well, usually you don't keep intermediate or generated files in a VCS,
so why keep the metadata there?

> possible to distribute metadata cache directly from git and
> subversion repositories (among others). Since a digest is inherently
> more expensive to obtain than a timestamp, package managers may use
> the Manifest entries as a digest cache, in order to avoid the need
> to compute digests of ebuilds during dependency calculations.
>
> Does the suggested approach seem reasonable? Would anybody like to
> suggest any changes?
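
The "Manifest as digest cache" idea quoted above could look roughly like the following Python sketch. It assumes Manifest2-style lines of the form "EBUILD <name> <size> <HASH> <hex> ...", un-prefixed 10-digit SHA-1 entries as in the original proposal, and that full SHA-1 digests of the inherited eclasses are already available from somewhere; the function names are made up for illustration.

```python
from typing import Dict, Sequence

DIGEST_LEN = 10  # leftmost hex digits, as in the proposal


def manifest_sha1(manifest_text: str) -> Dict[str, str]:
    """Map ebuild file names to the SHA1 value listed in a Manifest.

    Assumes lines like: EBUILD <name> <size> <HASH> <hex> [<HASH> <hex> ...]
    """
    result = {}
    for line in manifest_text.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] != "EBUILD":
            continue  # skip DIST, AUX, MISC and malformed lines
        hashes = dict(zip(fields[3::2], fields[4::2]))
        if "SHA1" in hashes:
            result[fields[1]] = hashes["SHA1"].lower()
    return result


def cache_is_valid(digests_value: str, ebuild_name: str,
                   manifest: Dict[str, str], eclass_sha1: Sequence[str]) -> bool:
    """Compare a cached DIGESTS value with digests that are already known.

    eclass_sha1 holds full SHA-1 hex digests of the inherited eclasses,
    in INHERITED order. No ebuild is read or hashed here.
    """
    expected = [manifest.get(ebuild_name, "")] + list(eclass_sha1)
    stored = digests_value.split()
    if len(stored) != len(expected):
        return False
    return all(len(full) >= DIGEST_LEN and s == full[:DIGEST_LEN]
               for s, full in zip(stored, expected))
```

The point of the sketch is only that the ebuild digest comes from a lookup rather than from hashing the file during dependency calculation.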

Cheers,
Tiziano

--
-------------------------------------------------------
Tiziano Müller
Gentoo Linux Developer, Council Member
Areas of responsibility:
Samba, PostgreSQL, CPP, Python, sysadmin
E-Mail : dev-zero@g.o
GnuPG FP : F327 283A E769 2E36 18D5 4DE2 1B05 6A63 AE9C 1E30


File name MIME type
signature.asc application/pgp-signature