Gentoo Archives: gentoo-dev

From: Zac Medico <zmedico@g.o>
To: Gentoo Dev <gentoo-dev@l.g.o>
Subject: [gentoo-dev] [RFC] DIGESTS metadata variable for cache validation
Date: Mon, 02 Feb 2009 20:34:34


I'd like to add a new metadata cache value called DIGESTS, which will
contain a space-separated list of digests that can be used to
validate the metadata cache. Like INHERITED and DEFINED_PHASES [1],
it will be generated automatically. The first digest in the list will
correspond to the ebuild. If there are any inherited eclasses, their
digests will follow, in the same order that the eclasses occur in the
INHERITED variable. The value of the DIGESTS variable will be on
line 18 of the metadata cache (just after DEFINED_PHASES).

For the digest format, I suggest that we use the leftmost 10
hexadecimal digits of the SHA-1 digest. The rationale for limiting
it to 10 digits (out of 40) is to save space. Due to the avalanche
effect [2], 10 digits should be sufficient to ensure that problems
resulting from hash collisions are extremely unlikely.
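
To make the format concrete, here is a minimal Python sketch of how
such a value could be generated. The names short_digest and
build_digests are hypothetical, chosen only for illustration; they
are not existing portage functions, and the caller is assumed to
supply the eclass contents already in INHERITED order.

```python
import hashlib
from typing import Iterable

DIGEST_LEN = 10  # leftmost hex digits kept, as suggested in the proposal


def short_digest(data: bytes) -> str:
    """Return the leftmost DIGEST_LEN hex digits of the SHA-1 of data."""
    return hashlib.sha1(data).hexdigest()[:DIGEST_LEN]


def build_digests(ebuild: bytes, eclasses: Iterable[bytes]) -> str:
    """Build a DIGESTS value: the ebuild digest first, then one digest
    per eclass, in the order given (assumed to match INHERITED)."""
    digests = [short_digest(ebuild)]
    digests.extend(short_digest(eclass) for eclass in eclasses)
    return " ".join(digests)


if __name__ == "__main__":
    # Placeholder contents; real input would be the file bytes.
    print(build_digests(b"ebuild contents", [b"eclass one", b"eclass two"]))
```

Ten hexadecimal digits carry 40 bits, so if SHA-1 output is treated
as uniformly distributed, the chance that a changed file happens to
keep the same truncated digest is about 2^-40, or roughly 9.1e-13.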

The primary reason to use a digest for cache validation instead of a
timestamp is that it allows the cache validation mechanism to work
even if the tree is distributed with a protocol that does not
preserve timestamps, such as git or subversion. This would make it
possible to distribute the metadata cache directly from git and
subversion repositories (among others). Since a digest is inherently
more expensive to obtain than a timestamp, package managers may use
the Manifest entries as a digest cache, in order to avoid the need
to compute digests of ebuilds during dependency calculations.
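
The validation step could look roughly like the sketch below. It
assumes the package manager already has full 40-digit SHA-1 hex
strings for the ebuild and each inherited eclass (for the ebuild,
possibly taken from its Manifest entry rather than recomputed). The
function name cache_is_valid is hypothetical and not part of any
existing API.

```python
from typing import Sequence

DIGEST_LEN = 10  # leftmost hex digits stored in DIGESTS, per the proposal


def cache_is_valid(cached_digests: str, current_sha1_hex: Sequence[str]) -> bool:
    """Compare a cached DIGESTS value against current SHA-1 digests.

    cached_digests:   the DIGESTS string read from the metadata cache.
    current_sha1_hex: full SHA-1 hex digests, ebuild first, then the
                      eclasses in INHERITED order (assumed to be
                      supplied by the caller).
    """
    cached = cached_digests.split()
    current = [h[:DIGEST_LEN].lower() for h in current_sha1_hex]
    # A differing count means the set of inherited eclasses changed,
    # which also invalidates the cache entry.
    return cached == current
```

With this approach no timestamps are consulted at all, so the result
is the same regardless of how the tree was checked out.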

Does the suggested approach seem reasonable? Would anybody like to
suggest any changes?
