Gentoo Archives: gentoo-scm

From: Donnie Berkholz <dberkholz@g.o>
To: "Robin H. Johnson" <robbat2@g.o>
Cc: gentoo-scm@l.g.o
Subject: Re: [gentoo-scm] Welcome to Gentoo-SCM discussion, for figuring out Gentoo beyond CVS
Date: Tue, 09 Sep 2008 17:45:10
Message-Id: 20080909174506.GA32182@comet
In Reply to: [gentoo-scm] Welcome to Gentoo-SCM discussion, for figuring out Gentoo beyond CVS by "Robin H. Johnson"
1 On 23:43 Mon 08 Sep , Robin H. Johnson wrote:
2 > 1.
3 > Other large projects that have either had or are conducting the
4 > discussions about switching SCMs
5 > - FreeBSD
6 > - KDE
7 > - Ruby on Rails
8 > - GHC (Haskell)
9 > Know of any more?
10
11 - X.Org <http://keithp.com/blog/Repository_Formats_Matter/>
12 <http://keithp.com/blogs/Tyrannical_SCM_selection/>
13 - OpenSolaris <http://opensolaris.org/os/community/tools/scm/history/>
14 - GNOME <http://live.gnome.org/DistributedSCM>
15 - Linux kernel
16 - Gentoo <http://www.gentoo.org/proj/en/infrastructure/cvs-migration.xml>
17 (Needs an update)
18
19 > 3.
20 > Migration tools:
21 > - cvs2svn looks best as it's easy to customize into doing what we want
22 > it to (Python).
23
24 Other existing ones, for completeness:
25 - fromcvs <http://www.selenic.com/mercurial/wiki/index.cgi/fromcvs>
26 - parsecvs <http://cgit.freedesktop.org/~keithp/parsecvs/>
27 - git cvsimport
28
29 In case we want to split the tree into multiple modules:
30 - git-split <people.freedesktop.org/~jamey/git-split> (This dies on
31 our tree at the moment, in part because of recursion limits.)
32
33 We should figure out our requirements for a migration tool to make sure
34 we're picking the best one. Here's what I can think of:
35
36 Important
37 - Handling huge numbers of changesets (See package.mask)
38 - Not mixing unrelated changesets with similar commit messages
39 ("Version bump.")
40 - Incremental imports (Just what's changed since last import. Useful
41 for initial migration)
42
43 Unimportant
44 - Handling branches correctly (we essentially never used them)
45 - Hardware requirements for migration tools
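
To make the "not mixing unrelated changesets" requirement concrete, here is a minimal sketch of the usual heuristic for rebuilding changesets from per-file CVS revisions. It is not how cvs2svn or any other tool listed above actually works; the FileRevision type, the group_changesets name and the 300-second window are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRevision:
    """One per-file CVS revision (hypothetical, simplified record)."""
    path: str
    author: str
    message: str
    timestamp: int  # seconds since the epoch


def group_changesets(revisions, window=300):
    """Group per-file revisions into changesets.

    Revisions join the same changeset only if they share author and
    message, follow the previous revision of that changeset within
    `window` seconds, and do not touch a file the changeset already
    contains. The window value is an assumption, not a tuned number.
    """
    changesets = []
    open_sets = {}  # (author, message) -> most recent changeset for that key
    for rev in sorted(revisions, key=lambda r: r.timestamp):
        key = (rev.author, rev.message)
        current = open_sets.get(key)
        if (current is None
                or rev.timestamp - current[-1].timestamp > window
                or any(r.path == rev.path for r in current)):
            # Start a new changeset instead of merging by message alone.
            current = []
            changesets.append(current)
            open_sets[key] = current
        current.append(rev)
    return changesets
```

Note that this still merges two different packages bumped by the same author with the same message inside the window; splitting on the package directory as well would be one further heuristic to evaluate.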
46
47 What else?
48
49 > 4.
50 > Doing more test migrations, and having a test-plan for comparing them
51 > directly, as well as against other SCMs.
52
53 The OpenSolaris link above is quite useful for comparisons, and the
54 "Repository Formats Matter" post from Keith Packard is helpful for
55 understanding one good reason why git might be the best choice.
56
57 Same as above, what are our requirements and what doesn't matter? Here's
58 the OpenSolaris list:
59 http://opensolaris.org/os/community/tools/scm/dscmreqdoc/
60
61 Important
62 - Fast branching (This will make new styles of development possible
63 in Gentoo.)
64 - Fast committing (This will encourage more atomic commits from a
65 functional POV.)
66 - Reliable (Repository format & committing process guarantee no
67 corruption.)
68 - Usability (Either through discoverability or through good
69 documentation, found elsewhere or produced by us.)
70 - Modifiable (Written in a reasonably common language. Read: Python, C
71 or shell. git and bzr qualify, darcs doesn't.)
72 - Active upstream (Getting modifications into upstream code,
73 requesting features)
74 - Hooks (Implement custom checks upon commit to your own or the main
75 repository.)
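
As an illustration of the kind of custom check a hook could run, here is a minimal sketch of a commit-message check. The policy itself (rejecting bare messages such as "Version bump." and long summaries) and the names check_commit_message and GENERIC_MESSAGES are assumptions for the example, not an agreed Gentoo rule. With git, such a check could be wired up as a commit-msg hook, which receives the path of the message file as its first argument and aborts the commit on a non-zero exit status.

```python
import sys

# Hypothetical list of messages considered too generic on their own.
GENERIC_MESSAGES = {"version bump", "bump", "fix", "update"}

MAX_SUMMARY_LENGTH = 72  # assumed limit, chosen for illustration


def check_commit_message(message):
    """Return a list of problems; an empty list means the message passes."""
    lines = message.strip().splitlines()
    if not lines:
        return ["empty commit message"]
    problems = []
    summary = lines[0].strip()
    if len(summary) > MAX_SUMMARY_LENGTH:
        problems.append("summary line is too long")
    if summary.rstrip(".").lower() in GENERIC_MESSAGES:
        problems.append("summary does not say what was changed")
    return problems


def main(argv):
    """Entry point when installed as a commit-msg style hook."""
    with open(argv[1], encoding="utf-8") as handle:
        problems = check_commit_message(handle.read())
    for problem in problems:
        print(f"rejected: {problem}", file=sys.stderr)
    return 1 if problems else 0


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```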
76
77 Optional
78 - Partial checkouts. They aren't useful enough to be a requirement, in
79 my view, because I have yet to hear a good reason they're needed. A
80 gig or two of disk space is cheap.
81 - Integration into popular text editors
82 - CVS gateway (people can still commit using CVS)
83 - Shallow checkouts (Only getting partial history to reduce size. git
84 supports grafting two repositories together, not sure about other
85 SCMs. Not sure how to do the initial splice. Explore
86 'git-filter-branch'?)
87
88 Unimportant
89 - ???
90
91 What else?
92
93
94 Another point I'd like to get into is how we should architect this.
95 Should we stick with the single repository for the whole thing, or
96 should we break it down so that each package has its own repository? If
97 we go with the latter, we need to figure out a way to easily check out &
98 update the whole repo.
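
A rough sketch of what "check out & update the whole repo" could look like with one repository per package: clone what is missing, fast-forward what is already there. The URL layout (base_url/name.git), the function name sync_commands and the idea of passing in the set of already-cloned packages are all assumptions for illustration. The function only builds the git command lines, so it can be inspected without touching the network.

```python
def sync_commands(packages, base_url, root, already_cloned):
    """Build the git commands needed to bring every package up to date.

    packages:       iterable of package names (flat, no categories)
    base_url:       hypothetical URL prefix holding <name>.git repositories
    root:           local directory that holds one checkout per package
    already_cloned: set of package names that already exist under root
    """
    commands = []
    for name in sorted(packages):
        target = f"{root}/{name}"
        if name in already_cloned:
            # Existing checkout: update it without creating merge commits.
            commands.append(["git", "-C", target, "pull", "--ff-only"])
        else:
            # Missing checkout: fetch the full repository.
            commands.append(["git", "clone", f"{base_url}/{name}.git", target])
    return commands
```

Each returned list could be handed to subprocess.run(); running several at once would be a natural next step, given the number of packages in the tree.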
99
100 We also encounter issues with atomic commits across multiple packages.
101 git has submodule support to partially address this, although it may
102 require slight enhancements so that it keeps all of the submodules at
103 HEAD instead of at arbitrary commits. This additionally runs into some
104 potential issues with duplication of history if packages move, etc. I
105 don't remember the details, but Robin knows about them.
106
107 One interesting possibility with packages as separate repositories is
108 that we could keep a flat structure of repositories and use some type
109 of map to arrange it into categories for rsync. This opens the door to
110 using tags instead of categories. More thought needed.
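
To show what "some type of map" might mean, here is a minimal sketch in which each flat repository carries a list of tags and the first tag doubles as its category in the rsync tree. The data layout and the names rsync_paths and packages_with_tag are assumptions for illustration only.

```python
def rsync_paths(tags_by_package):
    """Map each flat repository to a category/package path for rsync.

    tags_by_package maps a package name to an ordered list of tags;
    the first tag is treated as the primary category (an assumption).
    """
    paths = {}
    for package, tags in tags_by_package.items():
        if not tags:
            raise ValueError(f"{package} has no tags, so no category")
        paths[package] = f"{tags[0]}/{package}"
    return paths


def packages_with_tag(tags_by_package, tag):
    """List every package carrying a tag, primary or not."""
    return sorted(p for p, tags in tags_by_package.items() if tag in tags)
```

For example, a package tagged both app-editors and dev-util would be placed under app-editors/ for rsync but would still show up when searching for dev-util.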
111
112 --
113 Thanks,
114 Donnie
115
116 Donnie Berkholz
117 Developer, Gentoo Linux
118 Blog: http://dberkholz.wordpress.com