On 23:53 Tue 26 Oct, Robin H. Johnson wrote:
> On Tue, Oct 26, 2010 at 07:15:46PM +0000, Robin H. Johnson wrote:
> > Repo layout
> > * Natural option (seems to be used by most projects) is one package per repo
> > * Main problems: how to manage initial clones, updates, package moves, category moves
> > * The "repo" tool written for Android can handle most of this
> > * Renaming packages a problem (requires admin participation)
> > * Average 1-2/week over the past few years
> > * Moving packages between categories can be done by committers
> The other major problem with splitting the repo, is that the overhead
> that will be imposed by it. This got lost during the talks, but I've
> followed up with warthog9, and the size is going to hurt.
> In an attempt to explain that better, I've laid out below, the (inode)
> overhead costs per checkouts in each VCS, in a variant of O-notation.
> d = number of directories, f = number of files, r = number of
> CVS: O(3d)
> SVN: O(8d + 2f)
> Git: O(35r)
Very interesting numbers, Robin!
By making the (simplistic?) assumption that each O(foo) above is really
equal to foo, I tried to work out the direct disk-space comparisons that
follow from the info you provided.
In that case, SVN, Hg and Git all use at least 4x the space of CVS in
the per-package layout, and SVN's overhead doesn't even differ between
one big repo and per-package. A repo-per-package model in Git would be
nearly equivalent in inode overhead to using SVN. Is that really a bad
thing?
With 4K blocks, that works out at roughly 500 MB (CVS) to 2 GB (SVN,
Git) of inode overhead. I have a hard time imagining people so hard up
for disk space that they can fit the whole git repo but can't find
another 1.5 GB.
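
For anyone who wants to redo the arithmetic, here is a rough Python
sketch of it. It takes the quoted formulas at face value (O(foo) read as
exactly foo) and charges one 4K block per inode, which is my
simplification. The function names and the d/f/r values are hypothetical:
they are not measured from the tree, only picked so the results land in
the same ballpark as the figures above.

```python
BLOCK_SIZE = 4096  # bytes; the "4K blocks" assumed in this mail

# Per-checkout overhead formulas from the quoted mail, with O(foo)
# treated as exactly foo (a simplifying assumption, not a measurement).
FORMULAS = {
    "cvs": lambda d, f, r: 3 * d,
    "svn": lambda d, f, r: 8 * d + 2 * f,
    "git": lambda d, f, r: 35 * r,
}


def overhead_inodes(vcs, d, f, r):
    """Inode overhead for one checkout; d, f, r are the counts used
    in the quoted formulas."""
    return FORMULAS[vcs](d, f, r)


def overhead_bytes(vcs, d, f, r, block_size=BLOCK_SIZE):
    """Disk overhead, assuming each overhead inode costs one block."""
    return overhead_inodes(vcs, d, f, r) * block_size


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    d, f, r = 40_000, 100_000, 15_000
    for vcs in FORMULAS:
        mb = overhead_bytes(vcs, d, f, r) / 1e6
        print(f"{vcs}: {mb:.0f} MB")
```

With those made-up counts, CVS comes out near 0.5 GB and SVN and Git a
little over 2 GB. Plug in real counts from an actual checkout to get
numbers worth arguing about.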
Based on my current git conversion with a pack size of 1.7 GB, I suppose
that means the total repo in a git world could vary from ~2 GB all the
way up to ~4 GB.
Sr. Developer, Gentoo Linux