Gentoo Archives: gentoo-project

From: Raymond Jennings <shentino@×××××.com>
To: gentoo-project@l.g.o
Subject: Re: [gentoo-project] RFC: Dropping rsync as a tree distribution method
Date: Tue, 18 Dec 2018 18:39:30
Message-Id: CAGDaZ_odTJH+ON+B0sw5pSYCmeZAARjYs_51EJ4VOpxydadxFQ@mail.gmail.com
In Reply to: Re: [gentoo-project] RFC: Dropping rsync as a tree distribution method by Alec Warner
What if, as a first step, rsync were only dropped as the default?

If you change the default from rsync to git, you'd be closer to
removing rsync, but it's not as drastic as a sudden removal. That would
give us time to make sure it works properly without the risk of breaking
everything.

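For what it's worth, here is a rough sketch of what flipping the default
would look like on one machine, assuming Portage's repos.conf format. The
file path, the location, and the host in the sync-uri are my assumptions
(I believe that URI is the sync-friendly git mirror that carries
pre-generated metadata), so check them against your own setup first:

```ini
# /etc/portage/repos.conf/gentoo.conf (assumed path)
[DEFAULT]
main-repo = gentoo

[gentoo]
# Assumed location; use wherever your tree actually lives.
location = /usr/portage
# Previously: sync-type = rsync
sync-type = git
# Assumed mirror URI; verify before use.
sync-uri = https://anongit.gentoo.org/git/repo/sync/gentoo.git
auto-sync = yes
```

An existing rsync checkout at that location would probably have to be
moved aside first, since git can't clone into a populated directory.
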
On Tue, Dec 18, 2018 at 10:37 AM Alec Warner <antarus@g.o> wrote:
>
>
>
> On Tue, Dec 18, 2018 at 1:15 PM Brian Evans <grknight@g.o> wrote:
>>
>> On 12/15/2018 11:15 PM, Alec Warner wrote:
>> > Hi,
>> >
>> > I am currently embarking on a plan to redo our existing rsync[0] mirror
>> > network. The current network has aged a bit. It's likely too large and is
>> > under-maintained. I think in the ideal case we would instead pivot this
>> > project to scaling out our git mirror capabilities and slowly migrate
>> > all consumers to pulling the git tree directly. To that end, I'm looking
>> > for blockers as to why various customers cannot switch to pulling the
>> > gentoo ebuild repository from git[1] instead of rsync.
>> >
>> > So for example:
>> >
>> > - bandwidth concerns (preferably with documentation / data.)
>> > - Firewall concerns
>> > - CPU concerns (e.g. rsync is great for tiny systems?)
>> > - Disk usage for git vs rsync
>> > - Other things I have not thought of.
>> >
>> > -A
>> >
>> > [0] This excludes emerge-webrsync; which I don't plan on touching.
>> > [1] Rich talked about some downsides earlier
>> > at https://lwn.net/Articles/759539/; but while these are challenges
>> > (some fixable) they are not necessarily blockers.
>>
>> I personally would be sad to see rsync go as I use the git developer
>> tree as my main repository on 2 machines. This is so I can develop and
>> update from the single source. These have no news or md5-cache and it
>> can be painful to generate metadata on one of them.
>
>
> So my strawperson response is that you should have 2 repos.
>
> PORTDIR=https://gitweb.gentoo.org/repo/sync/gentoo.git/log/?h=master # a local copy of this thing.
> PORTDIR_OVERLAY=/path/to/your/checkout/of/gentoo.git
>
> I suspect however that this likely performs ...poorly, particularly in worst-case situations, as the 'overlay' would of course be massive in this configuration.
>
>>
>>
>> I rely on scripts to pull down the rsync metadata to expedite this
>> process, e.g. rsync <host>/gentoo-portage/metadata/md5-cache/. Git has
>> no easy sub-tree download equivalent that I know of.
>
>
> So I think overlaying the news and GLSA bits is easy (you have a post-sync script that cd's into various directories and clones the news and GLSA repos.) The costly bit is likely the metadata regeneration for your development branch of the tree. I'd be curious to see how much this costs (both cold and hot) for you to generate locally.
>
> -A
>
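
On the post-sync script idea: a minimal sketch of what such a hook could
look like, assuming Portage's repo.postsync.d mechanism (hooks are called
with the repo name, sync URI and location). The hook file name, the two
repository URLs and the DRY_RUN switch are my own assumptions for
illustration, not anything Alec specified, and I haven't measured how it
performs:

```shell
#!/bin/sh
# Hypothetical hook: /etc/portage/repo.postsync.d/50-news-glsa
# Portage calls repo.postsync.d hooks with: <repo name> <sync-uri> <location>.

# Assumed upstream repositories; verify the URLs before relying on them.
NEWS_URI="https://anongit.gentoo.org/git/data/gentoo-news.git"
GLSA_URI="https://anongit.gentoo.org/git/data/glsa.git"

# Clone a repository into $2 if absent, otherwise fast-forward it.
# Set DRY_RUN=1 to print the git command instead of running it.
sync_subrepo() {
    uri=$1 dest=$2
    if [ -d "$dest/.git" ]; then
        set -- git -C "$dest" pull --ff-only --quiet
    else
        set -- git clone --depth 1 --quiet "$uri" "$dest"
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

post_sync() {
    repo_name=$1 repo_location=$3
    # Only act on the main gentoo repository.
    [ "$repo_name" = gentoo ] || return 0
    sync_subrepo "$NEWS_URI" "$repo_location/metadata/news"
    sync_subrepo "$GLSA_URI" "$repo_location/metadata/glsa"
}

# Run only when invoked with the three hook arguments.
if [ "$#" -eq 3 ]; then
    post_sync "$@"
fi
```

Saved as an executable file under /etc/portage/repo.postsync.d/, it would
run after each sync; setting DRY_RUN=1 prints the git commands instead of
running them. It does nothing about the md5-cache, which would still have
to be regenerated locally.
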
>>
>>
>> Brian
>>
