Gentoo Archives: gentoo-dev

From: "Michał Górny" <mgorny@g.o>
To: gentoo-dev@l.g.o
Cc: infra-bugs@g.o, qa@g.o, council@g.o
Subject: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
Date: Sun, 14 Sep 2014 12:04:10
Message-Id: 20140914140344.6c6b99e5@pomiot.lan
1 Hi,
2
3 I'm quite tired of promises and all the perfectionist nonsense that
4 will lock us into CVS for the next 10 years of bikeshedding. Therefore,
5 I have prepared a plan for the git migration, and I believe it's doable
6 in less than 2 weeks (plus the testing). Of course, that assumes infra
7 is going to cooperate quickly or someone else is willing to provide the
8 infra for it.
9
10 I can provide some testing repos once someone is willing to provide
11 the hardware.
12
13
14 What needs to be done
15 ---------------------
16
17 I can do most of the scripting. What I need others to do is provide
18 the hosting for git repos. We can't use public services like GitHub
19 since they don't allow us to set our own update hook, so we can't
20 enforce signing policies etc.
21
22 Once basic infra is ready, I think the following is the best way to
23 switch:
24
25 1. send announcement to devs to explain how to use git,
26
27 2. switch CVS to read-only,
28
29 3. create all the git repos, get hooks rolling,
30
31 4. enable R/W access to the repos.
32
33 With some luck, no more than 2 hours of downtime.
34
35
36 The infra
37 ---------
38
39 The general idea is based on a 3-level structure that's an extension of
40 how Funtoo works. The following ultimately pretty picture explains that:
41
42 +----------------+
43 | developer repo | - - - - - - - - - - -,
44 +----------------+                      v
45         |                +------------------------------+
46         |                | cache, DTDs and other extras |
47         v                +------------------------------+
48 +----------------+                      |
49 | user sync repo | <--------------------'
50 +----------------+ - - - - - - - - - - ,
51         |                              v
52         |                +-----------------------------+
53         |                | ChangeLogs, thick Manifests |
54         v                +-----------------------------+
55 +----------------+                     |
56 |     rsync      | <-------------------'
57 +----------------+
58
59 Text version:
60
61 We have a main developer repo where developers work & commit and are
62 relatively happy. For every push into the developer repo, an automated
63 magic thingie merges stuff into the user sync repo and updates the
64 metadata cache there.
65
66 The user sync repo is for power users that want to fetch via git. It's
67 quite fast and efficient for frequent updates, and also saves space by
68 being free of ChangeLogs.
69
70 The rsync tree is propagated on top of the user sync repo. It is populated
71 with all old ChangeLogs copied from CVS (stored in a 30M git repo), new
72 ChangeLogs are generated from git logs, and Manifests are expanded.
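
As a rough illustration of generating ChangeLog entries from git logs, here is a minimal Python sketch. The `Commit` record, the function names and the exact entry layout (date, author, touched files, wrapped message) are assumptions made for the example; they only approximate the traditional ChangeLog look and are not the tool that would actually be used. The commit data would come from something like `git log --name-only`, which is left out here.

```python
import textwrap
from dataclasses import dataclass
from datetime import date
from typing import List

# Fixed month names so the output does not depend on the system locale.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


@dataclass
class Commit:
    """Hypothetical record holding what we need from one git commit."""
    day: date
    author: str
    email: str
    files: List[str]
    message: str


def changelog_entry(c: Commit) -> str:
    """Format one commit as a ChangeLog-style entry (assumed layout)."""
    stamp = f"{c.day.day:02d} {MONTHS[c.day.month - 1]} {c.day.year}"
    header = f"  {stamp}; {c.author} <{c.email}> {', '.join(sorted(c.files))}:"
    body = textwrap.fill(c.message, width=78,
                         initial_indent="  ", subsequent_indent="  ")
    return f"{header}\n{body}\n"


def render_changelog(commits: List[Commit]) -> str:
    """Join entries newest-first, separated by a blank line."""
    newest_first = sorted(commits, key=lambda c: c.day, reverse=True)
    return "\n".join(changelog_entry(c) for c in newest_first)
```

The generated text would be placed above the old ChangeLog content copied from CVS, so the rsync tree keeps a continuous history.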
73
74
75 Main developer repo
76 -------------------
77
78 I was able to create a starting git repository that takes around 66M
79 as a git pack (this is how much you will have to fetch to start working
80 with it). The repository is stripped clean of history and ChangeLogs,
81 and has thin Manifests only.
82
83 This means we don't have to wait till someone figures out the perfect
84 way of converting the old CVS repository. You don't need that history
85 most of the time, and you can play with CVS to get it if you really do.
86 In any case, we would likely strip the history anyway to get a small
87 repo to work with.
88
89 I have prepared a basic git update hook that keeps master clean
90 and attached it to the bug [1]. It enforces basic policies, prevents
91 forced updates and checks GPG signatures on the left-most history
92 line. It can also be extended to do more extensive tree checks.
93
94 For GPG signing, I relied upon gpg to do the right thing. That is, git
95 checks the signatures and we accept only trusted signatures. So
96 an external tool (gentoo-keys) needs to play with gpg to import, trust
97 and revoke developer keys.
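
To make the hook checks described above more concrete, here is a minimal Python sketch. It is not the hook attached to the bug; the function names, the choice to police only `refs/heads/master`, and the error strings are assumptions for illustration. It relies on real git commands: `git merge-base --is-ancestor` to detect forced updates, `git rev-list --first-parent` to walk the left-most history line, and the `%G?` log format, which prints `G` only for a good signature from a key gpg considers valid.

```python
import subprocess
from typing import Callable, List

ZERO = "0" * 40  # git passes an all-zero id when a ref is created or deleted


def git(*args: str) -> subprocess.CompletedProcess:
    """Run a git command and capture its output (no exception on failure)."""
    return subprocess.run(["git", *args], capture_output=True, text=True)


def check_update(ref: str, old: str, new: str,
                 run: Callable = git) -> List[str]:
    """Return a list of policy violations; an empty list means 'accept'."""
    if ref != "refs/heads/master":
        return []  # assumption: this sketch only polices master
    if new == ZERO:
        return ["deleting master is not allowed"]
    if old != ZERO:
        # A fast-forward requires the old tip to be an ancestor of the new one.
        if run("merge-base", "--is-ancestor", old, new).returncode != 0:
            return ["forced (non-fast-forward) update rejected"]
        rev_range = f"{old}..{new}"
    else:
        rev_range = new
    errors = []
    # --first-parent follows only the left-most history line.
    for sha in run("rev-list", "--first-parent", rev_range).stdout.split():
        status = run("log", "-1", "--format=%G?", sha).stdout.strip()
        if status != "G":  # anything else: unsigned, bad, or untrusted key
            errors.append(f"{sha}: no trusted signature (status {status})")
    return errors


def main(argv: List[str]) -> int:
    """Entry point: git calls the update hook as `update <ref> <old> <new>`."""
    errors = check_update(argv[1], argv[2], argv[3])
    for line in errors:
        print(line)
    return 1 if errors else 0
```

Since the decision is delegated to gpg via git, importing, trusting and revoking keys in the hook's keyring is what actually controls who can push.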
98
99 I think we should also merge gentoo-news & glsa & herds.xml into
100 the repository. They all reference Gentoo packages at a particular
101 state in time, and it would be much nicer to have them synced properly.
102
103 [1]: https://bugs.gentoo.org/show_bug.cgi?id=502060
104
105
106 User syncing repo
107 -----------------
108
109 IMO this will be the most useful syncing method. The user syncing repo
110 is updated automatically on developer repo commits, and afterwards
111 the md5-cache is regenerated and committed. Also other repositories (like
112 DTDs, glsas and others if you dislike the previous idea) are merged
113 into it.
114
115 This repo is still free of ChangeLogs (since git logs are more
116 efficient) and has thin Manifests. It's the space-efficient Gentoo
117 variant. And commits are signed so users can verify the trust.
118
119
120 The rsync tree
121 --------------
122
123 We'd also propagate things to rsync. We'd have to populate it with old
124 ChangeLogs, new ChangeLog entries (autogenerated from git) and thick
125 Manifests. So users won't notice much of a change.
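
For readers unfamiliar with the thin/thick distinction: a thin Manifest lists only distfiles (`DIST` lines), while a thick one also carries checksums for the files inside the package directory. The Python sketch below shows the expansion step under simplifying assumptions: the function names are made up, only SHA256 and SHA512 are computed (the real hash set is a repository setting), and entry ordering is not meant to match what Portage writes.

```python
import hashlib
from pathlib import Path
from typing import List


def manifest_line(kind: str, name: str, data: bytes) -> str:
    """Build one Manifest entry: type, name, size, then hash name/value pairs."""
    sha256 = hashlib.sha256(data).hexdigest()
    sha512 = hashlib.sha512(data).hexdigest()
    return f"{kind} {name} {len(data)} SHA256 {sha256} SHA512 {sha512}"


def thicken(pkg_dir: Path) -> List[str]:
    """Keep the thin DIST lines and add entries for files in the package dir."""
    manifest = pkg_dir / "Manifest"
    thin = manifest.read_text().splitlines() if manifest.exists() else []
    lines = [line for line in thin if line.startswith("DIST ")]
    for path in sorted(p for p in pkg_dir.rglob("*") if p.is_file()):
        rel = path.relative_to(pkg_dir)
        if rel.parts == ("Manifest",):
            continue  # the Manifest does not list itself
        if rel.parts[0] == "files":
            # AUX names are relative to the files/ directory.
            kind, name = "AUX", "/".join(rel.parts[1:])
        elif rel.suffix == ".ebuild":
            kind, name = "EBUILD", rel.name
        else:
            # Everything else (ChangeLog, metadata.xml) is MISC.
            kind, name = "MISC", rel.as_posix()
        lines.append(manifest_line(kind, name, path.read_bytes()))
    return lines
```

This would run after the ChangeLogs have been written into the rsync tree, so that their checksums end up in the thick Manifest as well.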
126
127 The remaining issue is signing of stuff. We could supposedly sign
128 Manifests but IMO it's a waste of resources considering how poor
129 the signing system is for non-git repos.
130
131 --
132 Best regards,
133 Michał Górny
