Cross-posting to scm; responses should go to scm please (and the
people who whinge about cross posting should go promptly to hell if
I have any say in the matter).

On Mon, Oct 01, 2012 at 05:58:43PM -0700, Diego Elio Pettenò wrote:
> On 01/10/2012 17:51, Gregory M. Turner wrote:
> >
> > Anyhow, I get it: administering the vcs for a huge project such as
> > Gentoo is very hard work. If I somehow gave some other impression, I'm
> > sorry. Perhaps Rich and I insensitively voiced our shared assumption
> > that Gentoo's continued reliance on cvs stems from a lack of motivation
> > and consensus, rather than a shortage of labor and resources.
>
> That's definitely not the case. While we do have had some complains
> (mostly from Prefix last I knew) about git's working, the consensus for
> going to git is there. The problems are vastly technical.
>
> Problems such as "how many developers would be fine with having to
> checkout 2GB of history to be able to commit"? git support shallow
> clones but not if you want to commit to them.

A few corrections:
1) You can commit to shallow clones. You can actually push from them
too- you just have to know what you're doing (your parent *has* to be
known to the other side, else you're trying to push a disconnected
history/graph to the other side, which doesn't know how to connect the
two). We won't be doing that fortunately, just noting that it is
possible if you're careful (and I know what the man page says; what
I'm saying is the full version, rather than the short version they
list there).

2) Grafts are what we'll be doing there; kind of shallow, but not.
Basically the same thing the kernel folk did.
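
To make the "your parent has to be known to the other side" rule from 1)
concrete, here is a toy model in Python. It is only a sketch: commits are
plain strings, can_push is a made-up helper and not anything git ships,
and real git checks object connectivity rather than doing a dict lookup.

```python
# Toy model only: commit ids are plain strings, not real git objects.

def can_push(remote_commits, outgoing):
    """Return True if every parent of every outgoing commit can be resolved.

    remote_commits: set of commit ids the receiving side already has.
    outgoing: dict mapping each new commit id to a list of its parent ids.
    """
    known = set(remote_commits) | set(outgoing)
    return all(parent in known
               for parents in outgoing.values()
               for parent in parents)


remote = {"a1", "b2", "c3"}

# Shallow clone cut off at c3; the new commit d4 sits on top of c3,
# which the remote has, so the two histories connect.
print(can_push(remote, {"d4": ["c3"]}))   # True

# Here the parent x9 exists only locally, so the remote cannot attach e5.
print(can_push(remote, {"e5": ["x9"]}))   # False
```

The second case is the "disconnected history/graph" situation: the other
side has nothing to hang the new commit on.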


As for the "quit your bitching and contribute already" rant angle:
Diego's accurate; minimally, it's more productive to contribute and
you're less likely to crap on folks' motivation, let alone risk the
wrath of a pissy person like me yelling at you.

Herein is the kicker: certain chunks of this can't be handled by
a random joe blow off the street- they require core infra access.

Bluntly (no disrespect to people, just being brutally direct) I don't
care if you have infra friends, I don't care if you maintain a couple
of boxes; if you're doing heavy ops in a production environment,
you'll understand the issue of trust/access- thus you'll understand
that some of this work cannot be done by anyone but infra.

Like it or not, very few people have access to the core cvs -> rsync
hosts/machinery- since each and every one of us with access is a
security angle that has to be tracked. That's not arguable, so don't
even try please.

That said, there are non-infra contributions people can make.

I suggest people do that; here's the list off the top of my head
(these are things that, worst case, I'll sort- which means it'll be
months out till I finish them considering my own time constraints and
focus on getting eapi5 support into pkgcore first).

0) First, the rules of the road for this discussion; assume that I'll
be bitchy if you violate these.

0.a) We're not dropping the existing history. Suggesting this is
asking for a killfile entry; it's viable for small or throw-away
projects, and the gentoo-x86 cvs repository is not a throw-away project.

0.b) Lesser offence since it's not obvious; the various suggestions
that we just snapshot this, then try to fix history after the fact
won't work- look into git's transitive trust: each commit's sha1
covers its parent's sha1, so changing anything upstream changes every
descendant. To do that sort of proposal means forcing a full history
rewrite down the line; this doesn't fly.

0.c) For whatever I've missed, assume that if it craps on developers'
workflow... it's a no go, and needs to be addressed. Does CVS suck?
Yes, I hate having to use it. But it *works*; switching to git has to
be, minimally, a lateral move for developers in terms of their
workflow- we cannot make it worse, else what's the point of this whole
exercise? There may be an exception or two here- things that aren't
sorted immediately upon conversion, but those exceptions will only fly
if they're minor, don't require history rewrites, and someone is
locked in/guaranteed to be working on it now (else we have no guarantee
it'll actually be sorted).
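
For anyone who hasn't looked at why 0.b forces a rewrite, a small Python
sketch of the sha1 chaining. This is an illustration, not git's actual
object format: commit_id and build_chain are made-up helpers, and a real
commit hashes a tree id, author data and so on along with the parent.

```python
import hashlib


def commit_id(parent_id, content):
    """Hash the content together with the parent's id (stand-in for a commit)."""
    return hashlib.sha1(f"{parent_id}\n{content}".encode()).hexdigest()


def build_chain(contents, root_parent=""):
    """Return the ids of a linear history, each commit built on the previous."""
    ids = []
    parent = root_parent
    for content in contents:
        parent = commit_id(parent, content)
        ids.append(parent)
    return ids


work = ["snapshot", "fix foo", "bump bar"]

# History that starts from a bare snapshot with no ancestry.
snapshot_only = build_chain(work)

# The very same commits, but with the old history attached underneath.
with_history = build_chain(work, root_parent=commit_id("", "old cvs history"))

# Same content, different ancestry: not a single id survives.
print([a == b for a, b in zip(snapshot_only, with_history)])
# [False, False, False]
```

Every commit made after the snapshot would get a new id once the old
history is put underneath it, which is the full rewrite nobody wants.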


1) We need a thin manifest -> thick manifest converter. Thin
manifests are used for git- they store just DIST entries. Thick (also
known as 'full') manifests are what cvs/rsync users are familiar with-
they hold checksums for all content.

1.a) This converter must use portage's APIs; ultimately, this
thin->thick conversion will be signed by an infra key (rather than the
current hodgepodge of devs). I suggest nesting it under the emaint
command.

1.b) This converter needs to be fast. $VCS -> rsync updates occur
every 30 minutes. The thin/thick sorting should be sub-minute, frankly;
going parallel (multiprocessing) is my suggestion, a threadpool worst
case (since most of the work won't be gil bound).

1.c) This absolutely has to be fucking stable. This will be a core
part of our infrastructure after all.

1.d) I will kneecap the first person who whines about portage on this,
or suggests NIH "let's just hack it"- they won't have to support it,
this goes into portage so it's proper, and so infra isn't stuck w/
more custom code.

1.e) This actually isn't that hard. Ask in #gentoo-portage for
details, look at portage's source, look at repoman's existing manifest
command- that manifest command already is the basics of it.
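
To show the shape of the per-package fan-out from 1.b, a rough Python
sketch. Loud caveats: this deliberately does NOT use portage (the real
converter has to, per 1.a and 1.d), digest_package and digest_tree are
made-up names, a single sha256 stands in for whatever hashes a real
thick manifest carries, and it uses the thread pool fallback rather than
multiprocessing. concurrent.futures.ProcessPoolExecutor has the same map
interface if you want the multiprocessing variant.

```python
import hashlib
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def digest_package(pkg_dir):
    """Return {relative_path: sha256 hex} for every file under pkg_dir."""
    entries = {}
    for root, _dirs, files in os.walk(pkg_dir):
        for name in files:
            if name == "Manifest":
                continue  # the manifest does not checksum itself
            path = os.path.join(root, name)
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            entries[os.path.relpath(path, pkg_dir)] = digest
    return entries


def digest_tree(pkg_dirs, workers=4):
    """Checksum each package directory on its own worker."""
    pkg_dirs = list(pkg_dirs)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(pkg_dirs, pool.map(digest_package, pkg_dirs)))


# Build a tiny throw-away tree so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tree:
    names = ("app-misc/foo", "dev-libs/bar")
    for pkg in names:
        pkg_dir = os.path.join(tree, pkg)
        os.makedirs(pkg_dir)
        ebuild = os.path.basename(pkg) + "-1.0.ebuild"
        with open(os.path.join(pkg_dir, ebuild), "wb") as fh:
            fh.write(b"# placeholder ebuild\n")
    result = digest_tree(os.path.join(tree, pkg) for pkg in names)
    for pkg_dir in sorted(result):
        print(os.path.relpath(pkg_dir, tree), sorted(result[pkg_dir]))
```

Packages are independent of each other, so one package per task is the
obvious unit of work; whether that gets you sub-minute on the full tree
is something to measure, not assume.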

1.5) Incremental signing of a tree is basically required; meaning
whatever scanner there is shouldn't require re-signing every single
package, only those that have changed thick-manifest wise.

1.6) Anyone looking to do this should pop into #gentoo-portage, talk
w/ a user named 'carebear', zmedico, etc; zmedico is portage's
maintainer, carebear is the current person volunteering to sort this
(help may be appreciated, talk to him/her/it).
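
The incremental part of 1.5 boils down to "remember what was signed,
compare, re-sign the difference". A minimal Python sketch of that
comparison, assuming the signer keeps a per-package fingerprint of the
manifest it last signed; manifest_fingerprint, packages_to_resign and
the placeholder manifest strings are all invented for illustration, and
no actual signing happens here.

```python
import hashlib


def manifest_fingerprint(manifest_text):
    """Fingerprint of a thick manifest's content."""
    return hashlib.sha256(manifest_text.encode()).hexdigest()


def packages_to_resign(current_manifests, signed_state):
    """Return packages whose thick manifest differs from the one last signed.

    current_manifests: dict of package name -> current thick manifest text.
    signed_state: dict of package name -> fingerprint stored at last signing.
    """
    return sorted(
        pkg for pkg, text in current_manifests.items()
        if signed_state.get(pkg) != manifest_fingerprint(text)
    )


# State recorded at the previous signing run (placeholder manifest bodies).
previous = {
    "app-misc/foo": "manifest body for foo, v1",
    "dev-libs/bar": "manifest body for bar, v1",
}
signed_state = {pkg: manifest_fingerprint(text) for pkg, text in previous.items()}

# Current run: foo changed, bar is untouched, baz is new.
current = {
    "app-misc/foo": "manifest body for foo, v2",
    "dev-libs/bar": "manifest body for bar, v1",
    "sys-apps/baz": "manifest body for baz, v1",
}

print(packages_to_resign(current, signed_state))
# ['app-misc/foo', 'sys-apps/baz']
```

Removed packages would need handling too (drop them from the stored
state); that's left out to keep this short.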


2) Building off of #1, although *NOT REQUIRED FOR CVS->GIT MIGRATION*,
just very strongly desired, is sorting the tree signing gleps while
we're at it. Start from http://www.gentoo.org/proj/en/glep/glep-0057.html ;
whatever solution #1 takes (likely an emaint command), tree signing
will be built right smack dab into it.


3) Robin afaik is putting together an email with the details; roughly,
the conversion process is conversion of cvs to svn, then svn2git
conversion; this is done since frankly it's the best/sanest conversion
pathway, and the fastest. Validating that conversion, and getting it
down to basically a set of known invocations, is required.

3.a) Roughly, the plan will be: snag the tree, start the conversion,
validate the results, repeat as necessary till we're happy with it.
This is the initial git core history. This step should be <8h; mostly
cpu time, frankly, although re-validation of that pathway is required
(I did a fair amount of optimization to this, but I've not rechecked
the runtime in a while- nor if there is a better option in existence).
Basically, it's strongly preferable we're not sorting this at the time
we're trying to do the live conversion- the core issues need to be
sorted before.

3.b) Take all cvs activity that has occurred since the tree was
snapshotted and conversion started, and replay it into git via tailor;
this is minor- and avoidable if we just shut the tree down for however
long 3.a takes; that said, the tailor route is the intention, and
shouldn't be a problem.


4) People who know git hooks well would be useful; server side,
all incoming pushes from devs will have their commits validated before
touching the tree- bad validation, commit gets kicked back to them.
The hooks for this need doing (development of this can be done locally
w/out having to access infra either). Hell, someone may already have
done something similar- I've not seen it, but we need something akin
to this; whoever does this needs to write it such that the auth
backend is configurable (upon deployment, this will be bound into
ldap, or an ldap scraped set of data that it'll consult); assume that
the auth backend will be user->gpg key level of validation (meaning I
cannot take a random commit antarus had against current ToT, and push
that on his behalf- robin may disagree on this point however).
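
Roughly what "configurable auth backend" could look like for 4), as a
Python sketch. Everything here is an assumption for illustration: the
Commit and StaticBackend classes, the keys_for method, and the fake
fingerprints are all made up, and the sketch assumes the hook has
already verified each signature and extracted the signer's fingerprint.
A real deployment would swap StaticBackend for something ldap-backed.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class Commit:
    sha: str
    signer_fingerprint: Optional[str]  # None when the commit is unsigned


class AuthBackend(Protocol):
    """Anything that can map a user name to the gpg keys they may sign with."""

    def keys_for(self, user: str) -> set: ...


class StaticBackend:
    """In-memory backend, handy for developing the hook locally."""

    def __init__(self, mapping):
        self._mapping = mapping

    def keys_for(self, user):
        return set(self._mapping.get(user, ()))


def rejected_commits(pusher, commits, backend):
    """Return the shas of commits not signed by one of the pusher's own keys."""
    allowed = backend.keys_for(pusher)
    return [c.sha for c in commits if c.signer_fingerprint not in allowed]


backend = StaticBackend({"ferringb": {"FPR-A"}, "antarus": {"FPR-B"}})
push = [
    Commit("c1", "FPR-A"),   # signed by the pusher: fine
    Commit("c2", "FPR-B"),   # signed by antarus, pushed by someone else
    Commit("c3", None),      # unsigned
]

print(rejected_commits("ferringb", push, backend))
# ['c2', 'c3']
```

A non-empty result is the "commit gets kicked back" case; the hook would
exit non-zero and list the offending shas.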



Were that to be done, that would leave for infra basically the
following- which is most definitely not a complete list:

1) gitolite configuration/setup, which afaik is basically sorted.
2) cvs -> rsync pathways being rebuilt to be git -> rsync (reliant on
#1 from above, but there is more that occurs there).
3) Thanking people for stepping up and helping to take care of the
stuff we're seriously low on time to sort.

If people don't step up, I'll be working my way through that list; that
said, my timetable were I to do this isn't "next week or the week
after"- it's "over the next few months as time allows".

Also, it's entirely possible I missed something for the non-infra
tasks people can contribute to; that's just a quick brain dump, pardon
any incorrect statements. If one has questions and answers aren't
coming through via the scm ml, then worst case track me down on
freenode via the ferringb nick; just assume I'll be wickedly laggy
in responding.

Finally, pardon the strong tones; the tone in use isn't meant to
dissuade people from contributing, it's meant to ensure people stay
focused on what's required here to get the job done- discussions about
building a git mirroring tier (for example) are for *after* the
initial work is done (understand that 99% of users will be using rsync
even when we switch devs' underlying vcs to git; longer term that may
change, but it's a v2 type thing, not a v1 type thing).

Cheers-
~harring