Gentoo Archives: gentoo-project

From: Raymond Jennings <shentino@×××××.com>
To: gentoo-project@l.g.o
Subject: Re: [gentoo-project] RFC: Dropping rsync as a tree distribution method
Date: Tue, 18 Dec 2018 11:36:54
Message-Id: CAGDaZ_p88TOB0ufrtZO3ebR5YzyCZn6NREANddOieM=w6NVGqw@mail.gmail.com
In Reply to: Re: [gentoo-project] RFC: Dropping rsync as a tree distribution method by Andrew Savchenko
1 On Tue, Dec 18, 2018 at 1:56 AM Andrew Savchenko <bircoph@g.o> wrote:
2 > On Sat, 15 Dec 2018 23:15:47 -0500 Alec Warner wrote:
3 > > Hi,
4 > >
5 > > I am currently embarking on a plan to redo our existing rsync[0] mirror
6 > > network. The current network has aged a bit. It's likely too large and is
7 > > under-maintained. I think in the ideal case we would instead pivot this
8 > > project to scaling out our git mirror capabilities and slowly migrate all
9 > > consumers to pulling the git tree directly. To that end, I'm looking for
10 > > blockers as to why various customers cannot switch to pulling the gentoo
11 > > ebuild repository from git[1] instead of rsync.
12 > >
13 > > So for example:
14 > >
15 > > - bandwidth concerns (preferably with documentation / data.)
16 > > - Firewall concerns
17 > > - CPU concerns (e.g. rsync is great for tiny systems?)
18 > > - Disk usage for git vs rsync
19 > > - Other things I have not thought of.
20 >
21 > My main concern with git is downlink fault tolerance. If rsync
22 > connection is broken, it can be easily restored without much data
23 > retransmission. If git download connection is broken, it has to
24 > start all over again. So there are cases where rsync will be always
25 > much more preferable than git.
26
27 Are you talking about the initial clone?
28 If so, would having the clone default to shallow mitigate this?
29
30 For the curious, I ran a benchmark.
31
32 With a completely purged /usr/portage:
33
34 emerge-webrsync took 30.302s
35 emerge-sync (with git clone --depth 1) took 33.902s
36 emerge-sync (with regular rsync) took a whopping 1m25.863s
37
38 After a fresh sync:
39
40 emerge-sync (with regular rsync) took 7.564s
41 emerge-sync (with git fetch --depth 1, and after priming the repo with
42 a full clone) took 2.086s
43
44
45
46 Up front, webrsync seems to be a small winner for initial setups, with
47 git clone a close second, and regular rsync roughly three times slower.
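
For anyone who wants to check the arithmetic behind these comparisons, here is a minimal Python sketch. It only converts the timings quoted above into seconds and computes ratios; the helper name `to_seconds` and the dictionary labels are my own for illustration, not anything from portage.

```python
def to_seconds(stamp: str) -> float:
    """Convert a `time`-style stamp such as '1m25.863s' or '30.302s' to seconds."""
    minutes, _, rest = stamp.rpartition("m")
    return (float(minutes) if minutes else 0.0) * 60 + float(rest.rstrip("s"))


# Timings copied from the benchmark above (purged /usr/portage).
initial = {
    "emerge-webrsync": to_seconds("30.302s"),
    "git clone --depth 1": to_seconds("33.902s"),
    "regular rsync": to_seconds("1m25.863s"),
}

# Timings copied from the benchmark above (after a fresh sync).
incremental = {
    "regular rsync": to_seconds("7.564s"),
    "git fetch --depth 1": to_seconds("2.086s"),
}


def slowdown(timings: dict) -> dict:
    """Return each method's time as a multiple of the fastest method."""
    fastest = min(timings.values())
    return {name: secs / fastest for name, secs in timings.items()}


if __name__ == "__main__":
    for label, timings in (("initial", initial), ("incremental", incremental)):
        for name, ratio in slowdown(timings).items():
            print(f"{label:12s} {name:22s} {ratio:.2f}x the fastest")
```

These are single runs on one machine, so the ratios are only a rough indication.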
48
49 Routine syncs would seem to prefer git, especially if they are done
50 with persistent regularity, which IMO would amortize the cost of the
51 initial clone. My opinion is that over time git would also place less
52 stress on the servers, since it only has to look at the commit chain
53 instead of checksumming every single file.
54
55
56
57 That said, would I be correct to surmise that you're advancing a
58 robustness issue and not simply a performance issue?
59
60
61 > Best regards,
62 > Andrew Savchenko

Replies

Subject Author
Re: [gentoo-project] RFC: Dropping rsync as a tree distribution method Andrew Savchenko <bircoph@g.o>