Kent Fredric posted on Sun, 26 Jun 2011 17:43:27 +1200 as excerpted:

> On 26 June 2011 15:49, Wyatt Epp <wyatt.epp@×××××.com> wrote:
>> As for the latter part, the size of a git repo becoming unmanageable
>> over time had not occurred to me, I'm afraid-- would it work to use
>> shallow clones?  Otherwise, the herd-wise division is probably
>> acceptable.  Need to think about that one more.
>
>     --depth <depth>
>         Create a shallow clone with a history truncated to the
>         specified number of revisions. A shallow repository has a
>         number of limitations (you cannot clone or fetch from it, nor
>         push from nor into it), but is adequate if you are only
>         interested in the recent history of a large project with a
>         long history, and would want to send in fixes as patches.
>
> It would be ok perhaps for non-contributing users to use shallow clones,
> but in my understanding, shallow clones limit you to doing what you
> could do with a tar file of the specified revision, which basically
> makes it impractical for people who are developing on it,
> and would mean every new developer would get a progressively longer time
> in order to do a complete check out.

Not substantially so, no.

FWIW, git scales VERY well in this regard, provided it's used for
text-based content (sources) as originally intended.  (It's not so hot at
binary blob management, but it's not designed for that.  Fortunately,
gentoo's usage would be nearly 100% text-based.)

What git does over time is compress the history into a series of packs
(delta-compressed, zlib-deflated pack files rather than tarballs), and
text compresses REALLY well.  A new clone receives the history already
packed; only objects added afterward sit around loose until the next
repack.  Existing users can run garbage collection (git gc) periodically
to fold their loose history into packs as well.
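
For anyone who wants to see how much of a repo is already packed, here is
a rough Python sketch of my own (not something from the thread).  It
relies on `git count-objects -v`, which prints "key: value" lines with
sizes in KiB.  The function names and the sample numbers are made up for
illustration only.

```python
import subprocess

def parse_count_objects(text):
    """Parse the 'key: value' lines printed by `git count-objects -v`."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[key.strip()] = int(value.strip())
    return stats

def repo_stats(path="."):
    """Run git inside `path` and return the parsed statistics."""
    out = subprocess.run(
        ["git", "count-objects", "-v"],
        cwd=path, check=True, capture_output=True, text=True,
    ).stdout
    return parse_count_objects(out)

# Hypothetical sample output, only to show the shape of the data.
SAMPLE = """count: 120
size: 480
in-pack: 2000000
packs: 3
size-pack: 890000
prune-packable: 0
garbage: 0
size-garbage: 0
"""

stats = parse_count_objects(SAMPLE)
# Fraction of the object store (by size) that already lives in packs.
packed_share = stats["size-pack"] / (stats["size-pack"] + stats["size"])
print(f"packed: {packed_share:.1%}")
```

Calling `repo_stats()` before and after a `git gc` should show the loose
`count` dropping as those objects move into packs.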

So for example, du says my kernel git tree totals 1.6 GB, including the
active checkout and two separate (dirty) build trees.  The bare git tree
(history repo without working tree) itself is 891 MB.  So the bare repo
is only 54% of the total, and I've not actually garbage-collected in some
time.  If I had, the ratio would be closer to 50%, meaning the entire
kernel git history repo compresses to roughly the size of the working
tree, and only roughly doubles the size of a single decompressed working
tarball.
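
The arithmetic behind that 54% can be checked in a few lines of Python.
This is only a sketch, and it assumes du's figures are in binary units
(1 GB taken as 1024 MB), which is my guess rather than something stated
above; with decimal units the share comes out a point or two higher.

```python
# Figures quoted above; binary units are an assumption.
total_mib = 1.6 * 1024   # whole tree: checkout, two build trees, history
bare_mib = 891           # the git history alone

share = bare_mib / total_mib   # history as a fraction of everything
rest = total_mib - bare_mib    # what the checkout and build trees occupy

print(f"history share: {share:.0%}, everything else: {rest:.0f} MiB")
```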

Over time that'll certainly grow a bit, but it really does scale well.
The kernel has been in git for enough time now that there's quite some
history built up, and that it only roughly doubles the size of a single
decompressed working tree snapshot, while making available at my
fingertips the entire history since original checkin, is impressive
indeed.

It's all down to how well the sources and diffs compress.  If there were
significant binary blobs in there (the kernel tree does have a few bits
of firmware, the tux logo, etc), it would compress far less effectively.
But gentoo's tree is pretty much all text as well, fortunately. =:^)
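
To put a rough number on the text-versus-blob difference, here is a toy
Python comparison.  It is a sketch under loud assumptions: the "text" is
an invented, very repetitive ebuild-like snippet, the "blob" is
pseudo-random bytes standing in for already-compressed binary data, and
plain zlib is used on its own, whereas git also delta-compresses objects
against each other before deflating them.  Real trees will land somewhere
in between.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

# Stand-in for source text: many near-identical lines (invented content).
text = "".join(
    f'DEPEND=">=dev-libs/example-{i % 7}.0"\nKEYWORDS="amd64 x86"\n'
    for i in range(2000)
).encode()

# Stand-in for a binary blob: pseudo-random bytes of the same length.
blob = bytes(rng.getrandbits(8) for _ in range(len(text)))

def ratio(data):
    """Compressed size as a fraction of the original size."""
    return len(zlib.compress(data, 9)) / len(data)

text_ratio = ratio(text)
blob_ratio = ratio(blob)
print(f"text: {text_ratio:.1%}  blob: {blob_ratio:.1%}")
```

The repetitive text shrinks to a small fraction of its size, while the
random blob barely shrinks at all, which is the effect described above.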

--
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman