Gentoo Archives: gentoo-dev

From: Zac Medico <zmedico@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] [RFC] DIGESTS metadata variable for cache validation
Date: Sat, 14 Feb 2009 20:15:39
Message-Id: 49972690.6080909@gentoo.org
In Reply to: Re: [gentoo-dev] [RFC] DIGESTS metadata variable for cache validation by Brian Harring
1 -----BEGIN PGP SIGNED MESSAGE-----
2 Hash: SHA1
3
4 Brian Harring wrote:
5 > On Wed, Feb 11, 2009 at 02:01:24AM -0800, Zac Medico wrote:
6 >> Brian Harring wrote:
7 >>> On Tue, Feb 10, 2009 at 12:55:51PM -0800, Zac Medico wrote:
8 >>>> Brian Harring wrote:
9 >>>>> Frankly, forget compatibility- the current format could stand to die.
10 >>>>> The repository format is an ever growing mess- leave it as is and
11 >>>>> work on cutting over to something sane.
12 >>>> Changing the repository layout is a pretty radical thing to do.
13 >>>> You're welcome to start a new subject for that if you'd like but I'd
14 >>>> prefer to keep the scope of this thread focussed on the cache format
15 >>>> for the existing repository layout.
16 >> I don't intend to repeal the cache mtime requirement, at least
17 >> (especially) not on gentoo's rsync tree. However, I wouldn't say
18 >> that it's something that necessarily needs to be a requirement for
19 >> other repositories or overlays, moving forward (assuming that an
20 >> alternative validation framework is in place).
21 >
22 > So... you want a subset of repositories to have cache algo x, while
23 > the rest have the old algo. And since the repo w/ algo x isn't
24 > marked in some fashion, all managers will have to use new algo x for
25 > compatibility reasons. Right...
26
27 Clients using either validation mechanism can consume the same
28 cache. If the client recognizes DIGESTS data and it's available in a
29 given cache entry, naturally the client should prefer the DIGESTS
30 validation mechanism because it's more reliable.
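
The fallback described above can be sketched in Python. This is only an
illustration, not Portage code: the entry layout (a dict with a "_mtime_"
key and an optional "DIGESTS" mapping of file path to SHA-1 hex digest),
the helper sha1_of and the function cache_entry_is_valid are all
hypothetical names, since this message does not specify the DIGESTS format.

```python
import hashlib
import os


def sha1_of(path):
    """Return the SHA-1 hex digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()


def cache_entry_is_valid(entry, ebuild_path, understands_digests):
    """Validate one cache entry (hypothetical layout, see lead-in).

    A digest-aware client prefers DIGESTS when the entry carries it.
    Any other client, or any entry without DIGESTS, uses the mtime rule.
    """
    digests = entry.get("DIGESTS")
    if understands_digests and digests:
        # Every recorded input must exist and hash to the recorded value.
        return all(
            os.path.exists(path) and sha1_of(path) == digest
            for path, digest in digests.items()
        )
    # Legacy rule: the recorded mtime must equal the ebuild's mtime.
    return int(os.stat(ebuild_path).st_mtime) == int(entry["_mtime_"])
```

Both kinds of client read the same entry; a client that does not know
about DIGESTS simply never looks at that key.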
31
32 >>> I reiterate, this belongs in a separate repository format, along w/
33 >>> the rest of the unversioned repository changes you've been pushing in
34 >>> (profile package.mask breaking all non portage PMs is a perfect
35 >>> example).
36 >> The package.mask thing is a separate discussion. Let's do that in a
37 >> separate thread.
38 >
39 > Package.mask is relevant purely as a demonstration of why unversioned
40 > changes to the repository formats *need* to stop. Generally speaking
41 > it's pretty shitty behaviour to embrace/extend a format when others
42 > rely on it for interop.
43
44 I agree that it's a poor practice to change the format in ways that
45 are not inter-operable. However, as said above, introduction of the
46 DIGESTS data is inter-operable.
47
48 > The annoying thing about this thread is that *nowhere* am I saying
49 > you shouldn't be free to experiment. All I'm stating is that the end
50 > result isn't a compatible repo- it *is* a new format (version even)
51 > thus mark it in some way so that the rest of us can start properly
52 > handling it rather than having to cut last minute releases since we're
53 > PMS compliant but portage treats PMS as a subset of its format rules.
54
55 As said, the end result of introducing the DIGESTS data _is_ a
56 compatible repo.
57
58 > Pretty simple request, and not something that should require argument
59 > as far as I'm concerned.
60
61 >>> The daft thing about this is that w/ effectively atomic sync (if the
62 >>> sync fails then mark the repo as screwed up till a sync completes),
63 >>> the current cache format can *still* do validation- no clue if
64 >>> paludis has it, but at least pkgcore and portage can handle this via
65 >>> awareness of the eclass stacking.
66 >> I want to have a more fault-tolerant solution than that.
67 >
68 > I understand your reasoning, and frankly I used to view the rsync
69 > issue in the same way- it's a naive view however since it implicitly
70 > is assuming that the resultant repo is *usable*, iow that the actual
71 > ebuild/eclass/profile data is valid, just that the updating bailed
72 > during metadata transfer. There is zero guarantee as to where the
73 > rsync bailed- meaning you can be missing patches, have trashed
74 > manifests, etc.
75 >
76 > Well aware it's not friendly to require people to force a completed
77 > sync before being able to use the repo, but it really is the only
78 > *safe* option- as such the fault tolerant counterarg is a non
79 > argument.
80
81 Problems aren't only triggered by sync issues. For example, suppose
82 that the user has locally modified an eclass in a way that results
83 in a metadata change. The DIGESTS data will provide enough
84 information to detect cases such as this. Without this data, the
85 user may be left scratching their head, wondering why their eclass
86 change hasn't been accounted for.
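
As a rough illustration of the eclass case, the sketch below reports which
recorded inputs no longer match their digests. It assumes, hypothetically,
that a cache entry records one SHA-1 hex digest per input file (the ebuild
and each inherited eclass); the function name changed_inputs and the
in-memory dict of file contents are stand-ins chosen for the example.

```python
import hashlib


def changed_inputs(recorded, current_contents):
    """Return the names whose current content no longer matches.

    recorded:         name -> SHA-1 hex digest stored in the cache entry
    current_contents: name -> bytes currently on disk (absent if missing)
    """
    changed = []
    for name, digest in sorted(recorded.items()):
        data = current_contents.get(name)
        # A missing file counts as changed, as does any content difference.
        if data is None or hashlib.sha1(data).hexdigest() != digest:
            changed.append(name)
    return changed
```

A non-empty result tells the client that the cached metadata is stale and
also names the file responsible, which is what a user who edited an
eclass locally would want to see.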
87
88 >>> Note that proper PM implementations *still* have to set the cache
89 >>> entries mtime for backwards compatibility w/ older PMs that don't
90 >>> support this new unversioned change thus muddying the implementation
91 >>> even further.
92 >> As said above, I wasn't intending that, at least (especially) not
93 >> for gentoo's rsync tree. I guess you got that idea from the mention
94 >> of bug 139134, but you don't need to worry about it.
95 >
96 > Implicitly it's required; if pkgcore is to generate cache entries for
97 > repo x, it has to do exactly as I said so that any pre
98 > cache-modified-managers are still able to use the cache. That's
99 > assuming the $PM cares about compatibility...
100
101 As said, clients using either validation mechanism can consume the
102 same cache, so introduction of the DIGESTS data will be fully
103 inter-operable.
104 - --
105 Thanks,
106 Zac
107
108 -----BEGIN PGP SIGNATURE-----
109 Version: GnuPG v2.0.9 (GNU/Linux)
110
111 iEYEARECAAYFAkmXJo4ACgkQ/ejvha5XGaNG7wCgwnOtEKD8VuKLAjFyahAJpQIJ
112 HWAAn2woN2CxmvAXu5ir0/N7ZvJLrAbc
113 =NVzc
114 -----END PGP SIGNATURE-----