Gentoo Archives: gentoo-scm

From: Donnie Berkholz <dberkholz@g.o>
To: "Robin H. Johnson" <robbat2@g.o>
Cc: gentoo-scm@l.g.o
Subject: Re: [gentoo-scm] meeting followup: repo layout
Date: Sat, 30 Oct 2010 02:28:44
Message-Id: 20101030022833.GC9386@comet
In Reply to: [gentoo-scm] meeting followup: repo layout by "Robin H. Johnson"
1 On 23:53 Tue 26 Oct , Robin H. Johnson wrote:
2 > On Tue, Oct 26, 2010 at 07:15:46PM +0000, Robin H. Johnson wrote:
3 > > Repo layout
4 > > * Natural option (seems to be used by most projects) is one package per repo
5 > > * Main problems: how to manage initial clones, updates, package moves, category moves
6 > > * The "repo" tool written for Android can handle most of this
7 > > * Renaming packages a problem (requires admin participation)
8 > > * Average 1-2/week over the past few years
9 > > * Moving packages between categories can be done by committers
10 > The other major problem with splitting the repo is the overhead
11 > it will impose. This got lost during the talks, but I've
12 > followed up with warthog9, and the size is going to hurt.
13 >
14 > In an attempt to explain that better, I've laid out below the (inode)
15 > overhead costs per checkout in each VCS, in a variant of O-notation.
16 > d = number of directories, f = number of files, r = number of
17 > repositories.
18 > CVS: O(3d)
19 > SVN: O(8d + 2f)
20 > Git: O(35r)
21
22 Very interesting numbers, Robin!
23
24 Making the (simplistic?) assumption that each O(foo) above is really
25 =foo inodes, I tried to work out the direct disk-space comparison that
26 follows from the info you provided.
27
28 In that case, SVN, Hg and Git all use at least 4x the space of CVS in a
29 per-package layout, and SVN doesn't even differentiate between one big
30 repo and per-package. A repo-per-package model in Git would be nearly
31 equivalent in inode overhead to using SVN. Is that really a bad thing?
32
33 With 4K blocks, that works out at roughly 500 MB (CVS) to 2 GB (SVN,
34 Git) of inode overhead. I have a hard time imagining people so hard up
35 for disk space that they can fit the whole git repo but can't find
36 another 1.5 GB.
37
38 Based on my current git conversion with a pack size of 1.7 GB, I suppose
39 that means the total repo in a git world could vary from ~2 GB all the
40 way up to ~4 GB.
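
The arithmetic behind the two paragraphs above can be sketched roughly as follows. This is only an illustration: it reads each O(foo) as exactly foo inodes and charges one 4K block per inode, as the message does. The directory, file and package counts are hypothetical placeholders, not figures from this thread, and adding the pack size to the overhead to get a total is an assumption about how the ~2 GB to ~4 GB range was reached.

```python
# Sketch only: treats the O(...) terms as exact inode counts and
# assumes every overhead inode costs one full 4K block.
BLOCK_SIZE = 4096  # bytes, the "4K blocks" from the message

# Placeholder counts, NOT measured from the Gentoo tree.
DIRS = 40_000       # d: directories in a checkout (hypothetical)
FILES = 100_000     # f: files in a checkout (hypothetical)
PACKAGES = 14_000   # r: repositories in a per-package layout (hypothetical)
PACK_BYTES = 1.7e9  # pack size quoted in the message


def overhead_inodes(vcs: str, d: int = 0, f: int = 0, r: int = 0) -> int:
    """Inode overhead per checkout, reading O(foo) as exactly foo."""
    formulas = {"cvs": 3 * d, "svn": 8 * d + 2 * f, "git": 35 * r}
    return formulas[vcs]


def overhead_bytes(inodes: int, block_size: int = BLOCK_SIZE) -> int:
    """Disk space used if each overhead inode occupies one block."""
    return inodes * block_size


def gb(n: float) -> str:
    """Format a byte count in decimal gigabytes."""
    return f"{n / 1e9:.2f} GB"


cvs = overhead_bytes(overhead_inodes("cvs", d=DIRS))
svn = overhead_bytes(overhead_inodes("svn", d=DIRS, f=FILES))
git_split = overhead_bytes(overhead_inodes("git", r=PACKAGES))
git_single = overhead_bytes(overhead_inodes("git", r=1))

print("CVS overhead:            ", gb(cvs))
print("SVN overhead:            ", gb(svn))
print("Git overhead, per package:", gb(git_split))
print("Git total, one big repo:  ", gb(PACK_BYTES + git_single))
print("Git total, per package:   ", gb(PACK_BYTES + git_split))
```

Swapping in real counts for DIRS, FILES and PACKAGES is the only change needed to rerun the comparison for an actual tree.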
41
42 --
43 Thanks,
44 Donnie
45
46 Donnie Berkholz
47 Sr. Developer, Gentoo Linux
48 Blog: http://dberkholz.wordpress.com
