Gentoo Archives: gentoo-dev

From: "Robin H. Johnson" <robbat2@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] Using emerge-webrsync to simplify the handbook
Date: Sat, 01 Dec 2012 22:44:59
Message-Id: robbat2-20121201T213056-338464740Z@orbis-terrarum.net
In Reply to: Re: [gentoo-dev] Using emerge-webrsync to simplify the handbook by Zac Medico
On Fri, Nov 30, 2012 at 09:35:07AM -0800, Zac Medico wrote:
> > However, I'm not aware of gnu tar's incremental archive. If it's much
> > faster than the above, then it should probably replace
> > emerge-delta-webrsync.
> If it has benefits over the current diffball approach used by
> emerge-delta-webrsync, then it seems like a good idea. It would be nice
> to integrate it directly into emerge-webrsync, and eventually deprecate
> emerge-delta-webrsync.
I went and did a rough comparison of Tar incrementals vs the existing
deltas.

TL;DR:
======
- Existing deltas are 8-9x smaller than the other options.
- We should consider retaining monthly snapshots, plus all the deltas.

Results:
========
1.
Using bzip2 -9 compression:
- Existing deltas are 9x smaller than tar-incremental.
- Existing deltas are 8x smaller than rsync-batch.

2.
If you just want to save bandwidth, the average full snapshot,
compressed w/ bzip2, is 55M. The average delta is 269k.
55M/269k = ~209.
Ergo it is LESS bandwidth to download ~180 deltas and apply those than
it is to download the full snapshot (assuming the upstream side of the
transactions accounts for ~30 deltas' worth of overhead).
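
The break-even arithmetic above can be sketched as follows. This is only
an illustration: it assumes binary units (1M = 1024k), which is what makes
55M/269k come out at ~209, and it reads the ~30 overhead allowance as
roughly 30 deltas' worth of traffic. The function names are made up for
this example.

```python
# Figures quoted in the message (assumption: 1M = 1024k).
SNAPSHOT_KIB = 55 * 1024   # average full snapshot, bzip2-compressed
DELTA_KIB = 269            # average daily delta


def break_even_deltas(snapshot_kib: int, delta_kib: int) -> float:
    """Number of deltas whose combined size equals one full snapshot."""
    return snapshot_kib / delta_kib


def deltas_cheaper(n_deltas: int, overhead_deltas: int = 30) -> bool:
    """True if fetching n_deltas, plus overhead, beats one full snapshot.

    overhead_deltas is a rough allowance for per-request overhead,
    expressed as an equivalent number of deltas (an assumption).
    """
    total_kib = (n_deltas + overhead_deltas) * DELTA_KIB
    return total_kib < SNAPSHOT_KIB


if __name__ == "__main__":
    print(round(break_even_deltas(SNAPSHOT_KIB, DELTA_KIB)))
    print(deltas_cheaper(175))
    print(deltas_cheaper(250))
```

With these inputs the break-even point is about 209 deltas, so a client
that is up to roughly six months behind would still download less by
chaining deltas than by fetching a fresh snapshot.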

Notes:
======
1.
When extracting tar incrementals, you must be VERY careful to perform
operations in the correct order, otherwise removed files will not
actually be deleted.

2.
When the Git repo goes live, we should tag at the point we take the
daily snapshot, and use those tags to also consider git bundles.
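
To illustrate the ordering concern in note 1, here is a minimal sketch
that chains the incrementals oldest-first and builds the GNU tar commands
to apply them. It assumes the file naming used in the numbers below, that
the archives were created with GNU tar's --listed-incremental, and that
they are extracted with --listed-incremental=/dev/null so that tar removes
files deleted between snapshots. The helper names are hypothetical, and
this is not how emerge-webrsync works today.

```python
import re
import subprocess

# Matches names such as portage-20121123-20121124.1.tar.bz2
INCREMENTAL_RE = re.compile(r"portage-(\d{8})-(\d{8})\.1\.tar\.bz2$")


def order_incrementals(base_date: str, names: list[str]) -> list[str]:
    """Return the incrementals chained from base_date, oldest first.

    Raises ValueError if some archive does not connect to the chain,
    since applying it out of sequence could leave stale files behind.
    """
    by_start = {}
    for name in names:
        match = INCREMENTAL_RE.search(name)
        if match:
            by_start[match.group(1)] = (match.group(2), name)
    ordered, current = [], base_date
    while current in by_start:
        end, name = by_start.pop(current)
        ordered.append(name)
        current = end
    if by_start:
        raise ValueError(f"unchained incrementals start at: {sorted(by_start)}")
    return ordered


def extract_commands(baseline: str, incrementals: list[str], dest: str) -> list[list[str]]:
    """Build one tar command per archive, baseline first."""
    return [
        ["tar", "--extract", "--listed-incremental=/dev/null",
         "--file", archive, "--directory", dest]
        for archive in [baseline, *incrementals]
    ]


def apply_all(commands: list[list[str]]) -> None:
    """Run the commands in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Refusing to proceed when the chain has a gap is a deliberate choice here:
skipping or reordering an incremental is exactly the case where removed
files would silently survive.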

Numbers:
========

Baseline tarball:
57919736 portage-20121123.0.tar.bz2

Tar incrementals, daily:
2554334 portage-20121123-20121124.1.tar.bz2
2045216 portage-20121124-20121125.1.tar.bz2
1936313 portage-20121125-20121126.1.tar.bz2
2355342 portage-20121126-20121127.1.tar.bz2
2063612 portage-20121127-20121128.1.tar.bz2
2582600 portage-20121128-20121129.1.tar.bz2
2720135 portage-20121129-20121130.1.tar.bz2

Rsync incrementals, daily:
2224311 portage-20121123-20121124.rsync-batch.bz2
1869241 portage-20121124-20121125.rsync-batch.bz2
1802648 portage-20121125-20121126.rsync-batch.bz2
1936937 portage-20121126-20121127.rsync-batch.bz2
1868771 portage-20121127-20121128.rsync-batch.bz2
2240386 portage-20121128-20121129.rsync-batch.bz2
2028207 portage-20121129-20121130.rsync-batch.bz2

Existing deltas, daily:
252400 snapshot-20121123-20121124.patch.bz2
267094 snapshot-20121124-20121125.patch.bz2
161136 snapshot-20121125-20121126.patch.bz2
225349 snapshot-20121126-20121127.patch.bz2
245804 snapshot-20121127-20121128.patch.bz2
232549 snapshot-20121128-20121129.patch.bz2
332835 snapshot-20121129-20121130.patch.bz2

Rsync incrementals, from baseline:
2224311 portage-20121123-20121124.rsync-batch.bz2
2536620 portage-20121123-20121125.rsync-batch.bz2
2700715 portage-20121123-20121126.rsync-batch.bz2
2986403 portage-20121123-20121127.rsync-batch.bz2
3258723 portage-20121123-20121128.rsync-batch.bz2
3824015 portage-20121123-20121129.rsync-batch.bz2
4232674 portage-20121123-20121130.rsync-batch.bz2
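
The 9x and 8x ratios in the results can be re-derived from the daily
figures above. A short sketch, using only the byte counts listed in this
message (the variable names are just for illustration):

```python
# Byte counts copied from the lists above, 20121123 through 20121130.
TAR_DAILY = [2554334, 2045216, 1936313, 2355342, 2063612, 2582600, 2720135]
RSYNC_DAILY = [2224311, 1869241, 1802648, 1936937, 1868771, 2240386, 2028207]
DELTA_DAILY = [252400, 267094, 161136, 225349, 245804, 232549, 332835]

# Single rsync batch covering the whole week (20121123 -> 20121130).
RSYNC_WEEK_FROM_BASELINE = 4232674


def size_ratio(other: list[int], deltas: list[int]) -> float:
    """How many times larger `other` is than `deltas`, summed over the week."""
    return sum(other) / sum(deltas)


if __name__ == "__main__":
    print(f"tar-incremental / deltas: {size_ratio(TAR_DAILY, DELTA_DAILY):.1f}x")
    print(f"rsync-batch / deltas:     {size_ratio(RSYNC_DAILY, DELTA_DAILY):.1f}x")
    week = RSYNC_WEEK_FROM_BASELINE / sum(DELTA_DAILY)
    print(f"one-week rsync batch / seven deltas: {week:.1f}x")
```

Over this week the tar incrementals total about 9.5 times the size of the
existing deltas and the daily rsync batches about 8.1 times. Even a single
rsync batch spanning the whole week is roughly 2.5 times the size of the
seven daily deltas combined.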

--
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail : robbat2@g.o
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85