Gentoo Archives: gentoo-scm

From: Michael Haggerty <mhagger@××××××××.edu>
To: Brian Harring <ferringb@×××××.com>
Cc: gentoo-scm@l.g.o
Subject: Re: [gentoo-scm] CVS -> git, list of where non-infra folk can contribute
Date: Thu, 04 Oct 2012 15:45:54
Message-Id: 506DAF24.6050002@alum.mit.edu
In Reply to: Re: [gentoo-scm] CVS -> git, list of where non-infra folk can contribute by Brian Harring
1 On 10/03/2012 01:04 PM, Brian Harring wrote:
2 > On Wed, Oct 03, 2012 at 06:22:11AM +0200, Michael Haggerty wrote:
3 >> On 10/02/2012 06:15 AM, Brian Harring wrote:
4 >>> [...]
5 >>> 3) Robin afaik is putting together an email with the details; roughly,
6 >>> the conversion process is conversion of cvs to svn, then svn2git
7 >>> conversion; this is done since frankly it's the best/sanest conversion
8 >>> pathway, and the fastest. The validation of that conversion, and
9 >>> getting it down to basically a set of known invocations is required.
10 >>
11 >> If you are using cvs2svn/cvs2git for the conversion, then I don't
12 >> understand why you want to do the conversion via Subversion. cvs2git
13 >> (using the ExternalBlobGenerator) is much faster than cvs2svn.
14 >>
15 >> I look forward to hearing the details from Robin.
16 >
17 > Well, it's 2012 now... optimization was from '10 (that 9x boost to
18 > cvs2svn came from us after all).
19
20 One big boost (especially for the peculiar Gentoo CVS repository
21 structure) came from your work on avoiding sorting when ordering commits.
22
23 Another big boost for cvs2git is to use the ExternalBlobGenerator, which
24 extracts revision contents to a "blob" file in CollectRevsPass using an
25 external script, thereby making OutputPass much faster.
26
27 If you are converting via Subversion, then you can't be taking advantage
28 of ExternalBlobGenerator and would also be suffering from the fact that
29 "svnadmin load" is much slower than "git fast-import", not to mention
30 the extra step of converting from Subversion to git.
31
32 So, if you are converting via Subversion and if performance is any kind
33 of concern, you should at least experiment with a direct conversion
34 using cvs2git.
35
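A minimal sketch of what such a direct experiment could look like, written as a Python helper that only builds the command lines and runs nothing. The repository path, the working directory and the target git directory are placeholder names chosen for illustration, not taken from this thread. The sketch uses cvs2git's plain command-line mode; switching on ExternalBlobGenerator is not shown here and should be taken from the cvs2git documentation.

```python
import shlex


def cvs2git_commands(cvs_repo, workdir="cvs2svn-tmp", git_dir="converted.git"):
    """Return shell commands for a direct CVS -> git conversion.

    Nothing is executed; all paths are hypothetical placeholders.
    """
    blob = f"{workdir}/git-blob.dat"
    dump = f"{workdir}/git-dump.dat"
    q = shlex.quote
    return [
        # cvs2git writes two git-fast-import streams: file contents (blobs)
        # and the commit history that refers to them.
        f"cvs2git --blobfile={q(blob)} --dumpfile={q(dump)} "
        f"--username=cvs2git {q(cvs_repo)}",
        # Create an empty repository to receive the import.
        f"git init --bare {q(git_dir)}",
        # Feed the blobs first, then the commits, into git fast-import.
        f"cat {q(blob)} {q(dump)} | git --git-dir={q(git_dir)} fast-import",
    ]


if __name__ == "__main__":
    for command in cvs2git_commands("/path/to/cvs/repo"):
        print(command)
```

Timing these three steps against the cvs2svn, "svnadmin load" and svn2git chain on the same input would give a direct comparison of the two routes.
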
36 By the way, the PostgreSQL project converted their repo directly to git
37 using cvs2git and did an extremely careful audit of the results.
38 cvs2git is still missing a few bells and whistles (for example,
39 .cvsignore files are not converted into .gitignore files, and things
40 like EOL conversion and keyword expansion are missing and/or awkward to
41 use). But other than that I think it is very much ready for production
42 use. Definitely use version 2.4.0, which I just released, as it
43 includes all the new goodness.
44
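Since .cvsignore files are not converted, one possible workaround is a post-processing step. The sketch below is a hypothetical helper, not part of cvs2git. It assumes the usual CVS rules: patterns are separated by whitespace, apply only to the directory holding the file, and a lone "!" clears everything listed before it. CVS's built-in default ignore list is not reproduced.

```python
def cvsignore_to_gitignore(text):
    """Translate the contents of one .cvsignore into .gitignore lines.

    Hypothetical helper for illustration; CVS's built-in default
    ignore patterns are deliberately not added.
    """
    patterns = []
    for token in text.split():
        if token == "!":
            # A lone "!" resets the ignore list accumulated so far.
            patterns.clear()
        else:
            # A leading "/" anchors the pattern to this directory, matching
            # the non-recursive scope of a .cvsignore entry.
            patterns.append("/" + token)
    return "".join(pattern + "\n" for pattern in patterns)


if __name__ == "__main__":
    print(cvsignore_to_gitignore("*.o core\nbuild\n"), end="")
```

The result would be written as .gitignore next to each .cvsignore in the converted tree and committed as a follow-up change.
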
45 > One thing to keep in mind with our tree; it's not exactly your
46 > standard source tree- for ebuilds, they show up, rarely get
47 > modified, then get renamed to -r{whatever}, or slightly tweaked and
48 > added as a new version.
49 >
50 > If memory serves- and it's 4am, so that trust is limited- the reasons
51 > for cvs2svn path were speed for our peculiar vcs history, and bugs in
52 > the cvs2git direct pathway.
53
54 More details and/or bug reports would be much appreciated if the
55 problems still exist in the 2.4.0 release (which is just out, though it
56 is well-used code with hardly any recent changes).
57
58 Michael
59
60 --
61 Michael Haggerty
62 mhagger@××××××××.edu
