On Mon, Feb 09, 2009 at 11:55:41AM -0800, Zac Medico wrote:
> All that I can say right now is that I recall questions about it in
> the past from overlay maintainers (I don't have a list) and the
> funtoo project is the only one which I can name offhand.
>
> However, the ability to distribute cache via a vcs is only an
> ancillary feature which is made possible by the DIGESTS data. The
> DIGESTS data is useful regardless of the protocol that is used to
> distribute the cache, since it allows the cache to be properly
> validated for integrity. So, the real primary reason for introducing
> the DIGESTS data is to provide a proper solution for cases like bug
> #139134 [1] in which invalid metadata cache goes undetected.

I'm sorry, but this proposal smells something awful. Because of the
mtime requirement on cache entries you're proposing jamming another
1.4MB into the cache for validation purposes (which should be 4x that
since a full checksum really should be in there) while trying to
maintain compatibility.

Frankly, forget compatibility- the current format could stand to die.
The repository format is an ever-growing mess- leave it as is and
work on cutting over to something sane.

Overlay maintainers who want the latest/greatest obviously can convert
over also; one would hope there would be enough cleanup to make it
worth their time.

As for the nasty gentoo-x86 compatibility, basically, do the
following:

1) maintain the existing cvs repo as is
2) iron out what cleanup/restructuring is desired. Jamming glep55 in
here is one possibility, for example. Basically, nail down the new
repo format (with an eye toward translating the cvs repo to it on
the fly).
3) use an eclass index holding the checksums, w/ the cache entries
referencing the index numbers instead (sorting the index by
consumption, meaning the more ebuilds using an eclass, the lower its
index): this brings the cache addition down to around 285KB
(acceptable imo) while giving full flexibility in the checksums
available for eclasses. This is assuming the current flat_list
format is still in use in the new repo...
4) drop the mtime requirement on cache entries; bump the entry's
mtime forward whenever it's updated (bug 139134 goes away), and jam
in an ebuild checksum of some sort.
5) rsync nodes are required to have 10GB of storage available, so
storage shouldn't be an issue, but ensuring all nodes have been
updated to sync both the old and *new* format is required.
6) suffer through cvs for a year (or whatever time frame), converting
folks over to the new url.
7) kill the old format after whatever period is deemed best
(potentially leaving a README telling folks how to update if they're
seriously behind).
8) convert the cvs repo to the new format, tear down the
transformation bits.
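
To make 3) and 4) a bit more concrete, here is a rough sketch in
Python. Everything in it is an assumption for illustration only: the
function names, the dict-based cache entry, the package names and the
choice of SHA-1 are made up, and none of it is portage's actual cache
code. The point is just that each cache entry carries a few small
index numbers instead of repeating full eclass checksums, and that
validation goes by content rather than mtime.

```python
import hashlib
from collections import Counter


def checksum(data: bytes) -> str:
    # SHA-1 is an arbitrary choice for this sketch; any digest would do.
    return hashlib.sha1(data).hexdigest()


def build_eclass_index(ebuild_eclasses, eclass_contents):
    """Build the shared eclass index and the per-ebuild references into it.

    ebuild_eclasses: ebuild name -> list of eclass names it inherits
    eclass_contents: eclass name -> raw bytes of the eclass
    Returns (index, refs): index is a list of (eclass, checksum) pairs,
    refs maps each ebuild to the index numbers of its eclasses.
    """
    usage = Counter(e for used in ebuild_eclasses.values() for e in used)
    # Most-consumed eclasses get the lowest index numbers; ties are
    # broken by name so the ordering is deterministic.
    ordered = sorted(usage, key=lambda e: (-usage[e], e))
    position = {e: i for i, e in enumerate(ordered)}
    index = [(e, checksum(eclass_contents[e])) for e in ordered]
    refs = {
        ebuild: sorted(position[e] for e in used)
        for ebuild, used in ebuild_eclasses.items()
    }
    return index, refs


def entry_is_valid(entry, ebuild_data, index, eclass_contents):
    """Validate a cache entry by content checksums, ignoring mtime."""
    if entry["ebuild_checksum"] != checksum(ebuild_data):
        return False
    return all(
        checksum(eclass_contents[index[i][0]]) == index[i][1]
        for i in entry["eclasses"]
    )


# Hypothetical data: three ebuilds, three eclasses.
eclass_contents = {
    "eutils": b"# eutils body\n",
    "python": b"# python body\n",
    "toolchain-funcs": b"# toolchain-funcs body\n",
}
ebuild_eclasses = {
    "app-misc/foo-1.0": ["eutils", "toolchain-funcs"],
    "app-misc/bar-2.0": ["eutils"],
    "dev-lang/baz-3.1": ["eutils", "python"],
}
index, refs = build_eclass_index(ebuild_eclasses, eclass_contents)
foo_ebuild = b"# foo ebuild\n"
foo_entry = {
    "ebuild_checksum": checksum(foo_ebuild),
    "eclasses": refs["app-misc/foo-1.0"],
}
```

With this layout a changed eclass invalidates every entry that
references its index number, without touching the entries themselves.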

Yes, the plan above is coarse- there aren't any glaring holes as far
as I can see, however. It does place restrictions on the repo format
chosen, but careful choices in the new format (heavy format
versioning) should make it possible to make this sort of issue less
of a pain down the line.

At the very least, doing a different repo format for repos/overlays
stored in a vcs that doesn't track mtime would solve their issues- it
also has the nice benefit of not making the repo more bloated for the
99% of folk who didn't even hit the issues spawning this.

If gentoo-x86 is left as is, bug 139134 can be headed off w/out
jamming a new metadata key in; to be clear, I'm likely going to
"Special Hell" for suggesting this, but if mtime/size on the new cache
entry is the same as on the old one, append a space to the value in
the description field.

All sane managers ought to be doing basic clean up of that value
anyways in their data layer (let alone at the UI level), but it's
enough to make rsync behave.
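
A minimal sketch of that hack, in Python. By default rsync's quick
check skips a file whose size and mtime both match the copy on the
other side, which is how a changed cache entry can slip through. The
KEY=value serialization and all the function names below are
hypothetical, chosen only to make the example self-contained; this is
not the real cache format or portage code.

```python
def serialize(entry):
    # Hypothetical on-disk form: one KEY=value line per key, sorted by key.
    return "".join(f"{k}={v}\n" for k, v in sorted(entry.items())).encode("utf-8")


def quick_check_would_skip(old_bytes, old_mtime, new_bytes, new_mtime):
    # rsync's default quick check treats a file as unchanged when both
    # its size and its mtime match.
    return len(old_bytes) == len(new_bytes) and old_mtime == new_mtime


def pad_description(old_entry, old_mtime, new_entry, new_mtime):
    """Return the entry to write, padded if rsync would otherwise miss it."""
    entry = dict(new_entry)
    old_bytes = serialize(old_entry)
    new_bytes = serialize(entry)
    # Only pad when the content really changed but size and mtime collide.
    if new_bytes != old_bytes and quick_check_would_skip(
        old_bytes, old_mtime, new_bytes, new_mtime
    ):
        # One extra byte is enough to change the size.
        entry["DESCRIPTION"] += " "
    return entry


def read_description(entry):
    # The data-layer clean up: strip the padding before anyone sees it.
    return entry["DESCRIPTION"].strip()


old = {"DESCRIPTION": "A tool", "DEPEND": "dev-libs/aaa"}
new = {"DESCRIPTION": "A tool", "DEPEND": "dev-libs/bbb"}
written = pad_description(old, 100, new, 100)
```

Here `old` and `new` serialize to the same number of bytes and share
an mtime, so `written` gets the trailing space, and
`read_description` hides it again from the manager.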

So... flame away.

~brian