Gentoo Archives: gentoo-dev

From: Zac Medico <zmedico@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] [RFC] DIGESTS metadata variable for cache validation
Date: Mon, 09 Feb 2009 19:55:29
Message-Id: 49908A3D.4050403@gentoo.org
In Reply to: Re: [gentoo-dev] [RFC] DIGESTS metadata variable for cache validation by "Tiziano Müller"
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Tiziano Müller wrote:
> On Saturday, 07.02.2009, at 15:23 -0800, Zac Medico wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Tiziano Müller wrote:
>>> On Monday, 02.02.2009, at 12:34 -0800, Zac Medico wrote:
>>>> For the digest format, I suggest that we use the leftmost 10
>>>> hexadecimal digits of the SHA-1 digest. The rationale for limiting
>>>> it to 10 digits (out of 40) is to save space. Due to the avalanche
>>>> effect [2], 10 digits should be sufficient to ensure that problems
>>>> resulting from hash collisions are extremely unlikely.
>>> I'd recommend prefixing the digest with a "{TYPE}" (like for hashed
>>> passwords) to be able to change the digest algorithm as needed
>>> (especially with regard to the current SHA successor competition).
>>> This allows a future package manager which might use SHA-3 for hashing
>>> (once it's released) to still check old digests. Furthermore, it would
>>> allow for an easier transition and only needs a definition of allowed
>>> hashes instead of a specific one.
>> I like that idea. That way it's not necessary to bump the EAPI in
>> order to change the hash function. So, a typical DIGESTS value might
>> look like this:
>>
>> SHA1 02021be38b a28b191904 3992945426 6ec21b29a3
>>
>>>> The primary reason to use a digest for cache validation instead of a
>>>> timestamp is that it allows the cache validation mechanism to work
>>>> even if the tree is distributed with a protocol that does not
>>>> preserve timestamps, such as git or subversion. This would make it
>>> Well, usually you don't keep intermediate or generated files in a VCS,
>>> so why the metadata?
>> People who distribute overlays commonly ask if it's possible to
>> distribute metadata cache with the overlay. Using a format that
>> doesn't rely on timestamps will allow them to distribute metadata
>> cache using their existing infrastructure, which is typically git or
>> subversion. In addition to overlays, it would also be useful for
>> forks of the entire gentoo tree, such as the funtoo tree [1].
>>
>> [1] http://github.com/funtoo/portage/tree/master
>
> Ok, now that the technical details have been discussed, I'd like to know
> which overlays or trees could really make use of it.
> Small overlays surely won't generate the metadata, because doing so is
> cumbersome and speed isn't really an issue for them.
> Most larger overlays/repositories will probably be able to set up rsync
> or implement a procedure using cron+tarball.
> So, who exactly is asking about being able to distribute the metadata
> cache via a VCS?

All that I can say right now is that I recall questions about it in
the past from overlay maintainers (I don't have a list) and the
funtoo project is the only one which I can name offhand.

However, the ability to distribute cache via a VCS is only an
ancillary feature which is made possible by the DIGESTS data. The
DIGESTS data is useful regardless of the protocol that is used to
distribute the cache, since it allows the cache to be properly
validated for integrity. So, the real primary reason for introducing
the DIGESTS data is to provide a proper solution for cases like bug
#139134 [1] in which invalid metadata cache goes undetected.
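
To make the validation idea concrete, here is a minimal Python sketch. It
is not Portage code: the function names are hypothetical, and it assumes
that DIGESTS holds a hash type followed by one truncated digest per input
file, in a fixed order. Only the type prefix and the 10-digit truncation
come from this thread.

```python
import hashlib

DIGEST_LEN = 10  # leftmost hex digits kept, as proposed in this thread

# Assumption: a table of allowed hash types; the thread only names SHA1.
ALLOWED_HASHES = {"SHA1": hashlib.sha1}


def short_digest(data, hash_type="SHA1"):
    """Return the leftmost DIGEST_LEN hex digits of the digest of data."""
    return ALLOWED_HASHES[hash_type](data).hexdigest()[:DIGEST_LEN]


def make_digests(contents, hash_type="SHA1"):
    """Build a DIGESTS value: the hash type, then one digest per input."""
    return " ".join([hash_type] + [short_digest(c, hash_type) for c in contents])


def cache_is_valid(digests_value, contents):
    """Check that contents match the digests recorded in digests_value."""
    fields = digests_value.split()
    if not fields or fields[0] not in ALLOWED_HASHES:
        return False  # empty value or unknown hash type: treat cache as stale
    hash_type, recorded = fields[0], fields[1:]
    if len(recorded) != len(contents):
        return False  # a file was added or removed since the cache was built
    return all(short_digest(c, hash_type) == r
               for c, r in zip(contents, recorded))
```

A cache entry would be regenerated whenever cache_is_valid() returns
False. Ten hexadecimal digits are 40 bits, so if the hash output is
treated as uniformly distributed, an accidental change goes unnoticed
with a probability of about 1 in 2**40.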

[1] http://bugs.gentoo.org/show_bug.cgi?id=139134
- --
Thanks,
Zac
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (GNU/Linux)

iEYEARECAAYFAkmQijwACgkQ/ejvha5XGaM2gQCguhueRSzVSr6GlFpTW6uutJ9p
mAQAoJ5LOuU9kl8wXEF3qzF5XFa2LdmH
=DTgz
-----END PGP SIGNATURE-----

Replies

Subject Author
Re: [gentoo-dev] [RFC] DIGESTS metadata variable for cache validation Brian Harring <ferringb@×××××.com>