Gentoo Archives: gentoo-dev

From: "Robin H. Johnson" <robbat2@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package
Date: Sat, 06 Aug 2011 20:55:45
Message-Id: robbat2-20110806T203746-162139648Z@orbis-terrarum.net
In Reply to: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package by Fabian Groffen
1 On Sat, Aug 06, 2011 at 04:13:52PM +0200, Fabian Groffen wrote:
2 > When we migrate away from CVS for gentoo-x86 (gx86), as it looks now,
3 > the same structure will be kept as we have in CVS now. Policies to
4 > reject merge commits and only allow rebases on e.g. the Git
5 > infrastructure will even more closely match the central and
6 > server-based way of working Gentoo is used to now.
7 The discussion about rejecting merges was never completed IIRC. I think
8 there may be some very valid cases where we need merges still (esp the
9 big atomic commit cases from KDE/GNOME), but they should still be used
10 sparingly. Additionally, a rebase-only policy has the problem of requiring
11 everybody else to hard-reset their trees if a developer has pushed to multiple
12 places and then rebases to push to the main tree, so I don't know if that
13 will actually fly.
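
To illustrate the hard-reset problem, here is a minimal sketch in a throwaway directory. The repo name `main`, the branch name `topic`, and the file names are made up for the demo. It records the commit a developer might already have published elsewhere, rebases it, and checks whether the old commit is still an ancestor of the new one.

```shell
# Hypothetical demo: a published commit gets rewritten by a rebase.
set -e
work=$(mktemp -d)
cd "$work"
git init -q main && cd main
git config user.name demo
git config user.email demo@example.invalid

echo a > a && git add a && git commit -q -m 'base'
base_branch=$(git symbolic-ref --short HEAD)

# Topic work that has (hypothetically) already been pushed somewhere else.
git checkout -q -b topic
echo t > t && git add t && git commit -q -m 'topic work'
published=$(git rev-parse topic)

# Meanwhile the main line moves on.
git checkout -q "$base_branch"
echo b > b && git add b && git commit -q -m 'main moves on'

# Rebase the topic so it can be pushed to the main tree without a merge.
git checkout -q topic
git rebase -q "$base_branch"
rebased=$(git rev-parse topic)

# If the published commit is not an ancestor of the rebased one, anyone
# who fetched it cannot fast-forward and has to reset or re-rebase.
if git merge-base --is-ancestor "$published" "$rebased"; then
    ff=yes
else
    ff=no
fi
echo "published=$published rebased=$rebased fast-forward=$ff"
```

Anyone who already fetched the published commit would then need something like `git fetch` followed by `git reset --hard` onto the new branch tip, which is the pain point described above.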
14
15 > In this email, I step away from the current model that Gentoo uses for
16 > the gentoo-x86 repository. Instead, I consider a repo-per-package
17 > model, as in use by e.g. Fedora [1] and Debian [2].
18 Everything you have mentioned here was previously covered in the
19 discussions about Git conversion models. Please consult the history of
20 this list, as well as the -scm list. Additionally, a large discussion
21 about the pros and cons of all 3 models (repo per package, repo per
22 category, single repo) was had at the GSoC mentor summit last year, and a
23 number of the core Git developers were involved in the discussion.
24
25 Problems:
26 - No atomic/well-ordered commits that span packages, eclasses and profiles/
27 directories (esp. committing to eclasses and then to packages
28 afterwards).
29 - Massive space overhead: Every .git directory requires a minimum of 25
30 inodes [1], covering at least 100KiB. We have 15k packages in the tree
31 right now. Assuming there is no tail-packing in use, that's a minimum
32 of roughly 1.4GiB (about 1.5GB) of .git overhead.
33 - Massive space overhead(2): Having a repo per package also removes ANY
34 git compression advantage that would be gained where ebuilds between
35 packages are substantially similar. The _complete_ history packfile
36 for the tree right now is under 1GiB [2].
37 - Pain in branching/forking: instead of being able to just have your own
38 local clone of the single git repo, a user wanting to work on multiple
39 packages together would need to have repos for ALL of them. No
40 pull/merge ability at all.
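
As a back-of-the-envelope check of the space-overhead estimate above, the arithmetic can be sketched like this. The variable names are illustrative; the inputs (15k packages, 25 inodes and 100KiB per .git directory) are the figures quoted in the list.

```shell
# Rough estimate of total .git overhead for a repo-per-package layout.
packages=15000        # "15k packages" from the text
per_repo_kib=100      # minimum .git footprint per repo, from the text
per_repo_inodes=25    # minimum inode count per repo, from the text

total_kib=$((packages * per_repo_kib))
total_inodes=$((packages * per_repo_inodes))

# Shell arithmetic is integer-only, so use awk for the GiB conversion.
total_gib=$(awk -v k="$total_kib" 'BEGIN { printf "%.2f", k / 1024 / 1024 }')

echo "inodes: $total_inodes"
echo "space:  $total_kib KiB (~$total_gib GiB)"
```

That is a lower bound: it assumes every repo sits at the measured minimum and that no tail-packing is in use.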
41
42 [1] Git space usage testcase:
43 mkdir foo && cd foo && git init \
44 && touch bar && git add bar && git commit -m '.' bar \
45 && git gc && du --exclude '*.sample' .git \
46 && find .git ! -name '*.sample' | wc -l
47
48 [2] Packfile size:
49 The final proposal regarding packfile size was that we were going to
50 partition older history using grafts, similar to when Linus moved the
51 kernel into Git and made a graft of the old history available. With that
52 split, the initial packfile size was under 50MiB.
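
For readers unfamiliar with grafts, here is a minimal sketch of the idea on a throwaway repo. It uses `git replace --graft`, which is a newer mechanism than the `.git/info/grafts` file and is not necessarily what the proposal would have used. The repo name, branch name `current`, and file `f` are made up for the demo.

```shell
# Hypothetical demo: stitch a separate "old history" under a new root commit.
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo && cd repo
git config user.name demo
git config user.email demo@example.invalid

# Stand-in for the old, partitioned-off history.
echo 1 > f && git add f && git commit -q -m 'old history'
old=$(git rev-parse HEAD)

# Stand-in for the new tree: a root commit with no parents.
git checkout -q --orphan current
echo 2 > f && git add f && git commit -q -m 'new root'
new=$(git rev-parse HEAD)

before=$(git rev-list --count HEAD)   # only the new root is reachable

# Make the new root appear to have the old commit as its parent.
git replace --graft "$new" "$old"

after=$(git rev-list --count HEAD)    # old history is now reachable too
echo "commits visible before graft: $before, after graft: $after"
```

The graft only changes the local view of history, so people who do not need the old history can skip fetching it and keep the small initial packfile.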
53
54 --
55 Robin Hugh Johnson
56 Gentoo Linux: Developer, Trustee & Infrastructure Lead
57 E-Mail : robbat2@g.o
58 GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85
