On Wed, Oct 27, 2010 at 01:53, Robin H. Johnson <robbat2@g.o> wrote:
> The other major problem with splitting the repo is the overhead
> that it will impose. This got lost during the talks, but I've
> followed up with warthog9, and the size is going to hurt.
>
> In an attempt to explain that better, I've laid out below the (inode)
> overhead cost per checkout in each VCS, in a variant of O-notation.
> d = number of directories, f = number of files, r = number of
> repositories.
> CVS: O(3d)
> SVN: O(8d + 2f)
> Git: O(35r)
>
> All of which represent the bare minimum number of inodes required.
>
> Our CVS tree tracks 21481 directories, and 118286 files.
> 14302 of those directories are packages.
>
> For the 3 models of Git:
> 1 giant repo: 1 repo only.
> repo-per-package: 14302 repos
> repo-per-category: 154 repos.
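To put rough numbers on the quoted formulas, here is a minimal sketch that plugs in the tree sizes given above. It assumes the O() coefficients can be read as exact per-checkout inode counts, which the original only states as a bare minimum; the function and variable names are made up for illustration.

```python
# Tree sizes quoted above for the CVS tree.
DIRS = 21481
FILES = 118286

# Repository counts for the three Git layouts quoted above.
GIT_LAYOUTS = {
    "1 giant repo": 1,
    "repo-per-category": 154,
    "repo-per-package": 14302,
}


def cvs_inodes(d):
    """Minimum inode overhead for a CVS checkout: 3 per directory."""
    return 3 * d


def svn_inodes(d, f):
    """Minimum inode overhead for an SVN checkout: 8 per directory, 2 per file."""
    return 8 * d + 2 * f


def git_inodes(r):
    """Minimum inode overhead for Git checkouts: 35 per repository."""
    return 35 * r


# Assumption: the coefficients are treated as exact counts, not just bounds.
overhead = {
    "CVS": cvs_inodes(DIRS),
    "SVN": svn_inodes(DIRS, FILES),
}
for layout, repos in GIT_LAYOUTS.items():
    overhead["Git, " + layout] = git_inodes(repos)

for name, inodes in overhead.items():
    print(f"{name:24s} {inodes:>8d} inodes")
```

Read this way, repo-per-package costs about 500570 inodes of overhead for a full checkout, compared with 64443 for CVS and 35 for a single Git repo.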
Does repo work with a tracker repo that keeps the state of all the
child repos, so you still get consistent state? I don't have much
experience with git, but I'm a hg developer, so I know a thing or two
about VCS.
The interesting case is where the tools let devs simply skip fetching
the packages/categories they don't work on. If that's impossible, then
I think just going with a single repo is probably best. From what I
know about how git works, I'd also say that git will deal with a large
tree all right (which would be a larger problem in hg, I think).
Cheers,
Dirkjan