Rich Freeman <rich0@g.o> wrote:
>
> Clearly it doesn't increase by a factor of 1 every year

The yearly increase of the factor is rather precisely 1:
According to current data, it is .95, see below.
With xz compression for squashfs, it is even 1.4!

(Note: increase _of_ the factor, not _by_ the factor, of course;
we are speaking about a linear increase, not an exponential one.)

More precisely: If in both cases you optimize extremely for space
(for details see below), then a change from rsync to git (non-shallow)
costs you

a) now: a factor of 2.6 in needed disk space

b) in future: every year this factor increases
by the summand 1.4. For example, in 2.5 years you will need roughly
2.6 + (1.4 * 2.5) = 6.1 times the disk space needed for rsync.
After 2.5 more years, the factor will be close to 10
(2.6 + 1.4 * 5 = 9.6).
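
The linear growth described in b) can be sketched as a small shell
helper. This is only an illustration using the rounded values 2.6 and
1.4 from above; the function name projected_factor is made up for this
example, and awk is assumed to be available because plain sh has no
floating-point arithmetic.

```shell
#!/bin/sh
# Rounded values from a) and b): the factor today and its yearly summand.
BASE=2.6
PER_YEAR=1.4

# Hypothetical helper: disk-space factor (git vs. rsync) after $1 years,
# assuming the linear model factor(t) = BASE + PER_YEAR * t.
projected_factor() {
    LC_ALL=C awk -v base="$BASE" -v step="$PER_YEAR" -v years="$1" \
        'BEGIN { printf "%.1f\n", base + step * years }'
}

projected_factor 2.5   # prints 6.1
projected_factor 5     # prints 9.6
```

Because the model is linear, doubling the time span does not double the
factor; only the part contributed by the yearly summand doubles.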

For a) I assumed that in both cases the current repository is kept
compressed with squashfs (xz). This first factor will be much
larger, of course, if you omit squashfs when you switch to git.
(You must take measures to keep the checked-out repository separate:
you cannot use standard emerge --sync to get this optimization.)

For both numbers, I even optimized the .git compression by
repeatedly executing
    git prune; git repack -a -d; git gc --aggressive
which for the historical repository took several hours;
thus, unless you use a cron job, this is not realistic.
Without this optimization, both numbers would be even larger.

Here are the plain data I used for the calculation:

1. RSYNC = 84,062,208
(rsync gentoo repository, compressed with squashfs (-comp xz).)

2. GIT = 136,322,616
(Current .git data, without checked-out tree;
compression optimized by the time-consuming commands above.)

3. FULL = 1,923,685,435
(.git data as in 2, but with history added)

4. YEARS = 15
(length of the historical data: first checkin was June 2000;
change to git was IIRC somewhere in mid-2015).

So the number from a) is

size with git     $GIT + $RSYNC
--------------- = ------------- ~ 2.6
size with rsync      $RSYNC

The number from b) is

size of history increase per year   ($FULL - $GIT) / $YEARS
--------------------------------- = ------------------------ ~ 1.4
         size with rsync                     $RSYNC
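
For anyone who wants to recheck the two quotients, here is a rough
sketch in shell. It only plugs the four numbers listed above into the
two formulas; the variable names factor_now and factor_per_year are
made up for this example, and awk is assumed to be available for the
floating-point division.

```shell
#!/bin/sh
# Sizes exactly as listed in the post (RSYNC is the xz-compressed squashfs).
RSYNC=84062208
GIT=136322616
FULL=1923685435
YEARS=15

# a) size with git / size with rsync
factor_now=$(LC_ALL=C awk -v r="$RSYNC" -v g="$GIT" \
    'BEGIN { printf "%.1f", (g + r) / r }')

# b) yearly growth of the history / size with rsync
factor_per_year=$(LC_ALL=C awk -v r="$RSYNC" -v g="$GIT" -v f="$FULL" -v y="$YEARS" \
    'BEGIN { printf "%.1f", (f - g) / y / r }')

echo "a) $factor_now"        # prints: a) 2.6
echo "b) $factor_per_year"   # prints: b) 1.4
```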

In the previous postings, I was assuming the much faster squashfs
compression -comp lz4 -Xhc instead of -comp xz. In this case,
the number from 1 changes to

RSYNC = 125,784,064

which leads to the factor .95 ~ 1 for b) which I mentioned in the
beginning.
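
The same check for the lz4 case, again only as a sketch: GIT, FULL and
YEARS are the values from the data list, only the rsync size differs.
The name RSYNC_LZ4 is made up for this example, and awk is assumed to
be available.

```shell
#!/bin/sh
# Only RSYNC changes (squashfs with -comp lz4 -Xhc instead of -comp xz).
RSYNC_LZ4=125784064
GIT=136322616
FULL=1923685435
YEARS=15

# b) yearly growth of the history / size with rsync, two decimal places
factor_per_year_lz4=$(LC_ALL=C awk -v r="$RSYNC_LZ4" -v g="$GIT" -v f="$FULL" -v y="$YEARS" \
    'BEGIN { printf "%.2f", (f - g) / y / r }')

echo "b) with lz4: $factor_per_year_lz4"   # prints: b) with lz4: 0.95
```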