On Tue, Dec 18, 2018 at 1:15 PM Brian Evans <grknight@g.o> wrote:

> On 12/15/2018 11:15 PM, Alec Warner wrote:
> > Hi,
> >
> > I am currently embarking on a plan to redo our existing rsync[0] mirror
> > network. The current network has aged a bit. It's likely too large and is
> > under-maintained. I think in the ideal case we would instead pivot this
> > project to scaling out our git mirror capabilities and slowly migrate
> > all consumers to pulling the git tree directly. To that end, I'm looking
> > for blockers as to why various customers cannot switch to pulling the
> > Gentoo ebuild repository from git[1] instead of rsync.
> >
> > So for example:
> >
> > - Bandwidth concerns (preferably with documentation / data)
> > - Firewall concerns
> > - CPU concerns (e.g. rsync is great for tiny systems?)
> > - Disk usage for git vs rsync
> > - Other things I have not thought of
> >
> > -A
> >
> > [0] This excludes emerge-webrsync, which I don't plan on touching.
> > [1] Rich talked about some downsides earlier
> > at https://lwn.net/Articles/759539/; but while these are challenges
> > (some fixable) they are not necessarily blockers.
>
> I personally would be sad to see rsync go as I use the git developer
> tree as my main repository on 2 machines. This is so I can develop and
> update from the single source. These have no news or md5-cache and it
> can be painful to generate metadata on one of them.
>
|
So my strawperson response is that you should have 2 repos.

PORTDIR=https://gitweb.gentoo.org/repo/sync/gentoo.git/log/?h=master # a local copy of this thing.
PORTDIR_OVERLAY=/path/to/your/checkout/of/gentoo.git
|
I suspect however that this likely performs ...poorly, particularly in
worst case situations as the 'overlay' would of course be massive in this
configuration.
|
|
>
> I rely on scripts to pull down the rsync metadata to expedite this
> process, e.g. rsync <host>/gentoo-portage/metadata/md5-cache/. Git has
> no easy sub-tree download equivalent that I know of.
>
|
So I think overlaying the news and GLSA bits is easy (you have a post-sync
script that cd's into various directories and clones the news and GLSA
repos). The costly bit is likely the metadata regeneration for your
development branch of the tree. I'd be curious to see how much this costs
(both cold and hot) for you to generate locally.
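
To make that concrete, here is a rough sketch of what such a post-sync
script could look like, written in Python. It is not a tested hook. The
repository URLs, the target sub-directories, the egencache flags, and the
helper names (sync_command, timed, main) are all assumptions for
illustration, so check them against your own setup before relying on it.

```python
import subprocess
import time
from pathlib import Path

# Assumed checkout location, taken from the PORTDIR_OVERLAY placeholder above.
REPO_ROOT = Path("/path/to/your/checkout/of/gentoo.git")

# Assumed mapping of target sub-directory -> upstream repository URL.
# Verify both the paths and the URLs before use.
OVERLAYS = {
    "metadata/news": "https://anongit.gentoo.org/git/data/gentoo-news.git",
    "metadata/glsa": "https://anongit.gentoo.org/git/data/glsa.git",
}

# Assumed regeneration command; adjust the repo name and job count locally.
REGEN_CMD = ["egencache", "--update", "--repo", "gentoo", "--jobs", "4"]


def sync_command(target, url):
    """Return the git command that brings `target` up to date with `url`.

    An existing clone is fast-forwarded; otherwise a shallow clone is made.
    """
    target = Path(target)
    if (target / ".git").exists():
        return ["git", "-C", str(target), "pull", "--ff-only"]
    return ["git", "clone", "--depth", "1", url, str(target)]


def timed(cmd, runner=subprocess.run):
    """Run `cmd` and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    runner(cmd, check=True)
    return time.perf_counter() - start


def main():
    # Step 1: overlay the news and GLSA data onto the developer checkout.
    for subdir, url in OVERLAYS.items():
        subprocess.run(sync_command(REPO_ROOT / subdir, url), check=True)

    # Step 2: regenerate metadata twice. The first run only approximates a
    # cold run if md5-cache starts out empty; the second run is the hot case.
    first = timed(REGEN_CMD)
    second = timed(REGEN_CMD)
    print(f"first run: {first:.1f}s, second run: {second:.1f}s")
```

Calling main() from whatever post-sync mechanism you already use would
give the two numbers I am asking about. The shallow clone is only there
to keep the download small; a full clone works just as well.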
|
-A
|
|
>
> Brian
>
>