Gentoo Archives: gentoo-scm

From: Brian Harring <ferringb@×××××.com>
To: gentoo-scm@l.g.o
Cc: "Michał Górny" <mgorny@g.o>, gentoo-dev <gentoo-dev@l.g.o>, infra-bugs@g.o, qa@g.o, Gentoo Council <council@g.o>
Subject: Re: [gentoo-scm] Re: My masterplan for git migration (+ looking for infra to test it)
Date: Tue, 16 Sep 2014 03:17:09
Message-Id: CAMMrfH5_dZXYZHqCt2dmn7LS7AOgZOD5N+t58q6Xw+8WVwA9Dg@mail.gmail.com
In Reply to: [gentoo-scm] Re: My masterplan for git migration (+ looking for infra to test it) by Rich Freeman
1 On Sun, Sep 14, 2014 at 10:33 AM, Rich Freeman <rich0@g.o> wrote:
2
3 > On Sun, Sep 14, 2014 at 8:03 AM, Michał Górny <mgorny@g.o> wrote:
4 > >
5 > > I'm quite tired of promises and all that perfectionist nonsense which
6 > > locks us up with CVS for the next 10 years of bikeshedding.
7 >
8 > While I tend to agree with the sentiment, I don't think you're
9 > actually targeting the problems that aren't already solved here.
10 >
11 > > Of course, that assumes infra is
12 > > going to cooperate quickly or someone else is willing to provide the
13 > > infra for it.
14 >
15 > The infra components to a git infrastructure are one of the main
16 > blockers at this point. I don't really see cooperation as the issue -
17 > just lack of manpower or interest.
18 >
19 > >
20 > > I can provide some testing repos once someone is willing to provide
21 > > the hardware.
22 >
23 > We already have plenty of testing repos (well, minus all the back-end
24 > stuff).
25 >
26 > >
27 > > 1. send announcement to devs to explain how to use git,
28 >
29 > This is one of the blockers. We haven't actually decided how we want
30 > to use git.
31 >
32 > Sure, everybody knows how to use git. The problem is that there are a
33 > dozen different ways we COULD use git, and nobody has picked the ONE
34 > way we WILL use it.
35 >
36 > This isn't as trivial as you might think. We have a fairly high
37 > commit rate and with a single repository that means that in-between a
38 > pull-merge/rebase-push there is a decent chance of another commit that
39 > will make the resulting push a non-fast-forward.
40 >
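The race described above is usually handled with a retry loop: rebase
onto the remote, try to push, and go around again if someone else got
in first. A minimal sketch, assuming a remote called "origin" and a
branch called "master" (both hypothetical here, as is the helper name
push_with_retry). The command runner is passed in so the loop can be
exercised without a real repository:

```python
import subprocess
from typing import Callable, Sequence

# A runner takes git arguments and returns the exit status.
Runner = Callable[[Sequence[str]], int]


def run_git(args: Sequence[str]) -> int:
    """Run a git command in the current directory and return its exit status."""
    return subprocess.run(["git", *args]).returncode


def push_with_retry(run: Runner = run_git, remote: str = "origin",
                    branch: str = "master", max_attempts: int = 5) -> int:
    """Rebase onto the remote branch and push, retrying on rejection.

    Returns the number of attempts used. Raises RuntimeError if a rebase
    fails or if the push is still rejected after max_attempts tries.
    """
    for attempt in range(1, max_attempts + 1):
        # Replay local commits on top of whatever landed upstream.
        if run(["pull", "--rebase", remote, branch]) != 0:
            raise RuntimeError("rebase failed; resolve conflicts by hand")
        # A non-zero status here usually means a non-fast-forward rejection:
        # another commit was pushed between our pull and our push.
        if run(["push", remote, branch]) == 0:
            return attempt
    raise RuntimeError(f"push still rejected after {max_attempts} attempts")
```

With a high commit rate the loop may spin several times, which is the
cost of everybody pushing to one branch of one repository.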
41 > People love to point out linux and its insane commit rate. The thing
42 > is, the mainline git repo with all those commits has exactly one
43 > committer - Linus himself. They don't have one big repo with one
44 > master branch that everybody pushes to. At least, that is my
45 > understanding (and there are certainly others here who are more
46 > involved with kernel development).
47 >
48 > >
49 > > 2. lock CVS out to read-only,
50 > >
51 > > 3. create all the git repos, get hooks rolling,
52 > >
53 > > 4. enable R/W access to the repos.
54 > >
55 > > With some luck, no more than 2 hours downtime.
56 >
57 > I agree that the actual conversion should be able to be done quickly.
58 >
59 > > The rsync tree is propagated on top of the user sync repo. It is populated
60 > > with all old ChangeLogs copied from CVS (stored in 30M git repo), new
61 > > ChangeLogs are generated from git logs and Manifests are expanded.
62 >
63 > So, I don't really have a problem with your design. I still question
64 > whether we still need to be generating changelogs - they seem
65 > incredibly redundant. But, if people really want a redundant copy of
66 > the git log, whatever...
67 >
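For illustration, generating ChangeLog text from commit data could look
roughly like the sketch below. The Commit record, the function names,
and the exact entry layout are assumptions made for this example, not
the format any existing hook produces. The commit data itself would
have to be parsed from "git log" output, which is not shown:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Commit:
    """Hypothetical record holding the fields a ChangeLog entry needs."""
    date: str          # already formatted, e.g. "16 Sep 2014"
    author: str
    email: str
    files: List[str]   # paths touched by the commit
    message: str


def changelog_entry(c: Commit) -> str:
    """Format one commit as a two-space-indented ChangeLog entry."""
    header = f"  {c.date}; {c.author} <{c.email}> {', '.join(c.files)}:"
    body = "\n".join("  " + line for line in c.message.splitlines())
    return header + "\n" + body + "\n"


def render_changelog(commits: List[Commit]) -> str:
    """Join entries in the order given, separated by blank lines."""
    return "\n".join(changelog_entry(c) for c in commits)
```

Everything in the output is derived from the commit, which is why the
generated file carries no information beyond the git log itself.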
68 > > Main developer repo
69 > > -------------------
70 > >
71 > > I was able to create a starting git repository that takes around 66M
72 > > as a git pack (this is how much you will have to fetch to start working
73 > > with it). The repository is stripped clean of history and ChangeLogs,
74 > > and has thin Manifests only.
75 > >
76 > > This means we don't have to wait till someone figures out the perfect
77 > > way of converting the old CVS repository. You don't need that history
78 > > most of the time, and you can play with CVS to get it if you really do.
79 > > In any case, we would likely strip the history anyway to get a small
80 > > repo to work with.
81 >
82 > We already have a migration process that converts the old CVS
83 > repository, generating both a shallow repository that lacks history
84 > and a full repository that contains all of history. Additionally,
85 > these two are consistent - that is, the last branch of the full
86 > repository has the same commit ID as the base of the shallow
87 > repository. Basically we generate the full history and then trim out
88 > 99% of it so that the commit in the shallow repository points to a
89 > parent that isn't in the packed repository.
90 >
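The reason the two repositories can share a commit ID is that git
derives the ID from the commit's own text, which names the parent by
hash but does not require the parent object to be present. A small
sketch of that hashing, using only hashlib. The author identity, the
commit messages, and the two dicts standing in for repositories are
made up for the example:

```python
import hashlib
from typing import List


def git_object_id(kind: str, body: bytes) -> str:
    """Hash an object the way git does: '<kind> <size>' + NUL + body, SHA-1."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, parents: List[str], who: str, message: str) -> bytes:
    """Build the text of a commit object; parents appear only as hashes."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {who}", f"committer {who}", "", message]
    return ("\n".join(lines) + "\n").encode()


EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # git's empty tree
WHO = "A Developer <dev@example.org> 1410837429 +0000"   # hypothetical identity

root = commit_body(EMPTY_TREE, [], WHO, "old history")
root_id = git_object_id("commit", root)
tip = commit_body(EMPTY_TREE, [root_id], WHO, "last CVS state")
tip_id = git_object_id("commit", tip)

# Toy object stores: the full one has both commits, the shallow one only the tip.
full_repo = {root_id: root, tip_id: tip}
shallow_repo = {tip_id: tip}  # parent object deliberately left out

print(tip_id in full_repo, tip_id in shallow_repo, root_id in shallow_repo)
```

Because the tip's bytes are identical in both stores, its ID is too, so
the full history can later be attached underneath the shallow base
without rewriting anything committed on top of it.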
91 > Actually doing the conversion is basically a solved problem. If this
92 > were actually the blocker I'd be all for just sticking the history in
93 > a different repo and starting from scratch with a new one.
94 >
95 > >
96 > > I think we should also merge gentoo-news & glsa & herds.xml into
97 > > the repository. They all reference Gentoo packages at a particular
98 > > state in time, and it would be much nicer to have them synced properly.
99 > >
100 >
101 > I can see the pros/cons here, but I don't personally have an issue
102 > with merging them. As has been brought up elsewhere, herds.xml may
103 > just go away.
104 >
105 > If somebody can come up with a set of hooks/scripts that will create
106 > the various trees and the only thing that is left is to get infra to
107 > host them, I think we can make real progress. I don't think this is
108 > something that needs to take a long time. The pieces are mostly there
109 > - they just have to be assembled.
110 >
111 > --
112 > Rich
113 >
114 >