Gentoo Archives: gentoo-scm

From:	Rich Freeman <rich0@g.o>
To:	gentoo-scm@l.g.o
Cc:	ferringb@×××××.com
Subject:	[gentoo-scm] Re: [gentoo-dev] CVS -> git, list of where non-infra folk can contribute
Date:	Tue, 02 Oct 2012 21:21:34
Message-Id:	`CAGfcS_m4FGBy2mkQwmz+QWyN4F=PBYEVX7JvDYhaArXq71y+TA@mail.gmail.com`

On Tue, Oct 2, 2012 at 4:20 PM, Gregory M. Turner <gmt@×××××.us> wrote:
> Brian Harring wrote:
>>
>> replay it into git via tailor;
>>
>
> Never knew about that tool... not sure about the wisdom of adding an extra
> moving part just to keep the lights on for those few hours... Given the "2G
> of history" issue Diego mentioned, which if I understand correctly,
> effectively means that the future gentoo git can never rebase its commit
> history, why chance it?

I think the reality is that we're going to have a million dress
rehearsals before we do the real thing. Apparently right now the
conversion isn't quite right, and we can't validate that it is right
either. I don't see any harm in having people look into keeping the
downtime low while others figure out how the migration works in the
first place.

Dress rehearsals don't even need to be announced. You just grab a
snapshot of CVS at some random time, convert it, and test it. Then
you grab another snapshot at a later time and try to use it to catch
up the converted repository. Then you test it all again. If you can
do that on demand without issue then I'd say we're ready to go.

I do plan to mess around with validation as I posted yesterday.
Rather than dump a lot of time into a "clever" solution like MapReduce,
where I have no experience, I'll probably just start with a
single-threaded proof of concept and see just how long it takes. I have
thought of ways to optimize things - you can descend the tree of all
the commits iteratively side-by-side, and at each step prune every
sub-tree that is a duplicate (with a little care to catch situations
where the tree might have been reverted). That means that instead of
descending the entire tree for every commit you only actually descend
the branches that have changes on each commit, which in most cases
will just be a single branch anyway. If the records don't proliferate
at each step then you're talking on the order of a few million
records to check each pass, with only a few passes - that might be
reasonable without much heavy equipment. However, the job should
still be able to be run in parallel as long as you still run it in
stages.
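
A minimal sketch of the pruning idea above, in Python, under stated
assumptions: each commit's tree is modeled as a nested dict (directories)
holding bytes (file contents), and tree_id and new_records are hypothetical
helpers for illustration, not part of git or of any conversion tool. Keeping
a single "seen" set across all commits, rather than comparing only against
the previous commit, is one possible way to cover the revert case, since a
sub-tree that returns to an earlier state is already in the set.

```python
import hashlib


def tree_id(node):
    """Content hash of a node: bytes are files, dicts are directories."""
    if isinstance(node, bytes):
        return hashlib.sha1(b"blob\0" + node).hexdigest()
    h = hashlib.sha1(b"tree\0")
    for name in sorted(node):
        h.update(name.encode() + b"\0" + tree_id(node[name]).encode())
    return h.hexdigest()


def new_records(node, seen, path=""):
    """Yield (path, id) for files not reachable through an already-seen sub-tree."""
    key = (path, tree_id(node))
    if key in seen:
        return  # identical sub-tree at this path was already checked: prune it
    seen.add(key)
    if isinstance(node, bytes):
        yield key
        return
    for name in sorted(node):
        child = f"{path}/{name}" if path else name
        yield from new_records(node[name], seen, child)


# Three commits: an initial tree, one changed file, then a revert of that change.
c1 = {"app": {"foo": b"1", "bar": b"2"}, "sys": {"baz": b"3"}}
c2 = {"app": {"foo": b"1b", "bar": b"2"}, "sys": {"baz": b"3"}}
seen = set()
for tree in (c1, c2, c1):
    print(len(list(new_records(tree, seen))))  # prints 3, then 1, then 0
```

Each pass produces only the records that have not been checked before, so
the same list generated from the CVS side could be compared against it. The
sketch recomputes hashes on every call for brevity; a real run would
presumably reuse ids the repository already stores.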

I've got pseudocode for the git side - so I'll see what I can do with it.

Rich

Replies

Subject	Author
Re: [gentoo-scm] Re: [gentoo-dev] CVS -> git, list of where non-infra folk can contribute	Michael Mol <mikemol@×××××.com>
Re: [gentoo-scm] Re: [gentoo-dev] CVS -> git, list of where non-infra folk can contribute	Peter Stuge <peter@×××××.se>
