Gentoo Archives: gentoo-scm

From:	Rich Freeman <rich0@g.o>
To:	"Michał Górny" <mgorny@g.o>
Cc:	gentoo-dev <gentoo-dev@l.g.o>, infra-bugs@g.o, qa@g.o, Gentoo Council <council@g.o>, gentoo-scm@l.g.o
Subject:	[gentoo-scm] Re: My masterplan for git migration (+ looking for infra to test it)
Date:	Sun, 14 Sep 2014 14:33:10
Message-Id:	`CAGfcS_nRN1EMkx-bqE=M1y=YgEumapsPTg5_kP2MgVY1+4og2Q@mail.gmail.com`

1	On Sun, Sep 14, 2014 at 8:03 AM, Michał Górny <mgorny@g.o> wrote:
2	>
3	> I'm quite tired of promises and all that perfectionist non-sense which
4	> locks us up with CVS for next 10 years of bikeshed.
5
6	While I tend to agree with the sentiment, I don't think you're
7	actually targeting the problems that aren't already solved here.
8
9	> Of course, that assumes infra is
10	> going to cooperate quickly or someone else is willing to provide the
11	> infra for it.
12
13	The infra components to a git infrastructure are one of the main
14	blockers at this point. I don't really see cooperation as the issue -
15	just lack of manpower or interest.
16
17	>
18	> I can provide some testing repos once someone is willing to provide
19	> the hardware.
20
21	We already have plenty of testing repos (well, minus all the back-end stuff).
22
23	>
24	> 1. send announcement to devs to explain how to use git,
25
26	This is one of the blockers. We haven't actually decided how we want
27	to use git.
28
29	Sure, everybody knows how to use git. The problem is that there are a
30	dozen different ways we COULD use git, and nobody has picked the ONE
31	way we WILL use it.
32
33	This isn't as trivial as you might think. We have a fairly high
34	commit rate and with a single repository that means that in-between a
35	pull-merge/rebase-push there is a decent chance of another commit that
36	will make the resulting push a non-fast-forward.
37
38	People love to point out linux and its insane commit rate. The thing
39	is, the mainline git repo with all those commits has exactly one
40	committer - Linus himself. They don't have one big repo with one
41	master branch that everybody pushes to. At least, that is my
42	understanding (and there are certainly others here who are more
43	involved with kernel development).
44
45	>
46	> 2. lock CVS out to read-only,
47	>
48	> 3. create all the git repos, get hooks rolling,
49	>
50	> 4. enable R/W access to the repos.
51	>
52	> With some luck, no more than 2 hours downtime.
53
54	I agree that the actual conversion should be able to be done quickly.
55
56	> On top of user sync repo rsync is propagated. The rsync tree is populated
57	> with all old ChangeLogs copied from CVS (stored in 30M git repo), new
58	> ChangeLogs are generated from git logs and Manifests are expanded.
59
60	So, I don't really have a problem with your design. I still question
61	whether we still need to be generating changelogs - they seem
62	incredibly redundant. But, if people really want a redundant copy of
63	the git log, whatever...
64
65	> Main developer repo
66	> -------------------
67	>
68	> I was able to create a start git repository that takes around 66M
69	> as a git pack (this is how much you will have to fetch to start working
70	> with it). The repository is stripped clean of history and ChangeLogs,
71	> and has thin Manifests only.
72	>
73	> This means we don't have to wait till someone figures out the perfect
74	> way of converting the old CVS repository. You don't need that history
75	> most of the time, and you can play with CVS to get it if you really do.
76	> In any case, we would likely strip the history anyway to get a small
77	> repo to work with.
78
79	We already have a migration process that converts the old CVS
80	repository, generating both a shallow repository that lacks history
81	and a full repository that contains all of history. Additionally,
82	these two are consistent - that is, the last branch of the full
83	repository has the same commit ID as the base of the shallow
84	repository. Basically we generate the full history and then trim out
85	99% of it so that the commit in the shallow repository points to a
86	parent that isn't in the packed repository.
87
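Why the IDs can match is easier to see in a toy model. The sketch below is not git's real object format and not the actual migration tooling. It assumes only that a commit ID is a hash over the commit's content, including its parent IDs, so a commit keeps its ID even when the parent object itself is left out of the store. All names are hypothetical.

```python
import hashlib


def commit_id(tree, parents, message):
    """Hash a simplified commit record; parent IDs are part of the input."""
    payload = "tree {}\n".format(tree)
    payload += "".join("parent {}\n".format(p) for p in parents)
    payload += "\n" + message
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def build_history(messages):
    """Build a linear history; return (object store, ID of the last commit)."""
    store = {}
    parent = None
    for index, message in enumerate(messages):
        parents = [parent] if parent else []
        tree = "tree-{}".format(index)  # stand-in for a real tree hash
        cid = commit_id(tree, parents, message)
        store[cid] = {"tree": tree, "parents": parents, "message": message}
        parent = cid
    return store, parent


def make_shallow(store, tip):
    """Keep only the tip commit; its parent pointer stays, the parent does not."""
    return {tip: store[tip]}


full, tip = build_history(["import from CVS", "fix ebuild", "version bump"])
shallow = make_shallow(full, tip)

commit = shallow[tip]
recomputed = commit_id(commit["tree"], commit["parents"], commit["message"])
missing = [p for p in commit["parents"] if p not in shallow]

print("same ID in both repositories:", recomputed == tip)
print("parents referenced but not stored:", len(missing))
```

Because the parent pointer is untouched, history from the full repository can later be attached underneath the shallow one without rewriting any commit.
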
88	Actually doing the conversion is basically a solved problem. If this
89	were actually the blocker I'd be all for just sticking the history in
90	a different repo and starting from scratch with a new one.
91
92	>
93	> I think we should also merge gentoo-news & glsa & herds.xml into
94	> the repository. They all reference Gentoo packages at a particular
95	> state in time, and it would be much nicer to have them synced properly.
96	>
97
98	I can see the pros/cons here, but I don't personally have an issue
99	with merging them. As has been brought up elsewhere herds.xml may
100	just go away.
101
102	If somebody can come up with a set of hooks/scripts that will create
103	the various trees and the only thing that is left is to get infra to
104	host them, I think we can make real progress. I don't think this is
105	something that needs to take a long time. The pieces are mostly there
106	- they just have to be assembled.
107
108	--
109	Rich

Replies

Subject	Author
[gentoo-scm] Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)	"Michał Górny" <mgorny@g.o>
Re: [gentoo-scm] Re: My masterplan for git migration (+ looking for infra to test it)	Brian Harring <ferringb@×××××.com>
Re: [gentoo-scm] Re: My masterplan for git migration (+ looking for infra to test it)	Brian Harring <ferringb@×××××.com>
