Gentoo Archives: gentoo-scm

From: Brian Harring <ferringb@×××××.com>
To: Michael Haggerty <mhagger@××××××××.edu>
Cc: gentoo-scm@l.g.o
Subject: Re: [gentoo-scm] CVS -> git, list of where non-infra folk can contribute
Date: Wed, 03 Oct 2012 11:05:09
Message-Id: 20121003110456.GA21633@localhost
In Reply to: Re: [gentoo-scm] CVS -> git, list of where non-infra folk can contribute by Michael Haggerty
1 On Wed, Oct 03, 2012 at 06:22:11AM +0200, Michael Haggerty wrote:
2 > On 10/02/2012 06:15 AM, Brian Harring wrote:
3 > > [...]
4 > > 3) Robin afaik is putting together an email with the details; roughly,
5 > > the conversion process is conversion of cvs to svn, then svn2git
6 > > conversion; this is done since frankly it's the best/sanest conversion
7 > > pathway, and the fastest. The validation of that conversion, and
8 > > getting it down to basically a set of known invocations is required.
9 >
10 > If you are using cvs2svn/cvs2git for the conversion, then I don't
11 > understand why you want to do the conversion via Subversion. cvs2git
12 > (using the ExternalBlobGenerator) is much faster than cvs2svn.
13 >
14 > I look forward to hearing the details from Robin.
15
16 Well, it's 2012 now... optimization was from '10 (that 9x boost to
17 cvs2svn came from us after all).
18
19 One thing to keep in mind with our tree; it's not exactly your
20 standard source tree- for ebuilds, they show up, rarely get
21 modified, then get renamed to -r{whatever}, or slightly tweaked and added as
22 a new version.
23
24 If memory serves- and it's 4am, so that trust is limited- the reasons
25 for the cvs2svn path were speed for our peculiar vcs history, and bugs in
26 the cvs2git direct pathway.
27
28 This was long enough ago, it seriously wouldn't surprise me in the
29 least if you've changed the playing field so that our original
30 approach is no longer sane.
31
32 We're using your code one way or another, so advice is obviously
33 welcome.
34
35
36 > > 3.a) Roughly, the plan will be snag the tree, start conversion.
37 > > Validate the results, repeat as necessary till we're happy with it.
38 > > This is the initial git core history. This step should be <8h; mostly
39 > > cpu time, frankly, although re-validation of that pathway is required
40 > > (I did a fair amount of optimization to this, but I've not rechecked
41 > > the runtime in a while- nor if there is a better option in existence).
42 > > Basically, it's strongly preferable we're not sorting this at the time
43 > > we're trying to do the live conversion- the core issues need to be
44 > > sorted before.
45 >
46 > There is a program contrib/verify-cvs2svn.py that can be used for some
47 > verification of a conversion. It tests that the contents of the tips of
48 > all branches (including master) and of all tags are the same in CVS as
49 > in the converted repository. If the gentoo project tags the significant
50 > points in its history, then this can give a lot of confidence in the
51 > conversion. (Though, I must admit, I have little experience with this
52 > script and don't know how easy it is to use.)
53
54 We tag nothing, and branch nothing. Just never fit our vcs usage
55 patterns.
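
With no tags or branches, the tip check described above reduces to comparing a checkout of the CVS tree against a checkout of the converted git tree. A minimal sketch of that comparison, assuming two working directories already exist on disk; the function names and the `SKIP_DIRS` set are hypothetical, and this is not what contrib/verify-cvs2svn.py does internally. Note that CVS keyword expansion can cause spurious differences unless both checkouts treat keywords the same way.

```python
import hashlib
import os

# Directories holding VCS metadata rather than tree content (assumption).
SKIP_DIRS = {"CVS", ".git"}


def tree_digests(root):
    """Map each relative file path under root to the SHA-256 of its content."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune metadata directories in place so os.walk does not descend.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def compare_trees(cvs_checkout, git_checkout):
    """Return (only_in_cvs, only_in_git, differing) as sorted path lists."""
    cvs = tree_digests(cvs_checkout)
    git = tree_digests(git_checkout)
    only_in_cvs = sorted(set(cvs) - set(git))
    only_in_git = sorted(set(git) - set(cvs))
    differing = sorted(p for p in set(cvs) & set(git) if cvs[p] != git[p])
    return only_in_cvs, only_in_git, differing
```

Three empty lists from `compare_trees` would mean the two tips match file for file; it says nothing about whether the intermediate history was converted correctly.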
56
57
58 > > 3.b) Take all cvs activity that has occurred since the tree was
59 > > snapshotted and conversion started, and replay it into git via tailor;
60 > > this is minor- and avoidable if we just shut the tree down for however
61 > > long 3.a takes; that said, the tailor route is the intention, and
62 > > shouldn't be a problem.
63 >
64 > In my opinion, it is more prudent to shut down CVS during the
65 > conversion. Otherwise, you effectively double the number of tools that
66 > you have to rely on and thus the number of possible points of failure.
67
68 Shutdown is my intent, and if the window is <4 hours, that's viable
69 (frankly easier to just force it, regardless of whinging).
70
71 If we're talking more like 8-12 hours... well, that may be a different
72 story.
73
74 Keep in mind that the times quoted there are just figures pulled
75 out of the ass- if memory serves, we had it at something like 2 hours,
76 which is a non issue. If the times turn out larger, then the secondary tailor
77 replay (note that validating it fortunately isn't nearly as
78 painful- it can be done manually since it'll be <50 revs) isn't horrible.
79 Best to keep in mind that a 4 hour maintenance window for something
80 like this, on occasion gains a multiplier or two. :)
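
The shutdown-versus-replay reasoning above can be sketched as a small decision helper. This is only an illustration: the function name, the return labels, and the idea of passing the slip multiplier explicitly are assumptions, and the 4 hour default is the "<4 hours" figure mentioned earlier in this mail.

```python
def plan_cutover(estimated_hours, multiplier, max_shutdown_hours=4.0):
    """Pick a cutover strategy from a conversion-time estimate.

    estimated_hours: measured or guessed runtime of the full conversion.
    multiplier: how much the maintenance window is expected to slip.
    max_shutdown_hours: longest CVS shutdown considered acceptable.
    """
    worst_case = estimated_hours * multiplier
    if worst_case < max_shutdown_hours:
        # Short enough to simply shut CVS down for the whole conversion.
        return "shutdown"
    # Otherwise convert from a snapshot and replay later commits afterwards.
    return "snapshot+replay"
```

For example, `plan_cutover(2, 1.5)` stays under the 4 hour limit, while `plan_cutover(8, 1.5)` does not and falls back to the replay route.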
81
82 ~harring
