On Fri, Jul 6, 2018 at 4:34 AM Davyd McColl <davydm@×××××.com> wrote:
>
> I understand that git history will build over time -- I'm less concerned
> with (eventual) disk usage than I am with the speed of `emerge --sync`,
> which (and perhaps I'm sorely mistaken) appeared to be faster using git
> than rsync -- hence my choice of git over rsync (the discussion at
> https://forums.gentoo.org/viewtopic-t-1009562.html shows me to not be
> alone in this experience).
>

From what I've generally seen and heard, git is much more efficient as
long as you sync frequently.

rsync has the advantage that it only transfers the minimum necessary
to get you from the tree you have now to the tree that is current. To
do this it has to stat every file (using default settings - you can
make it even slower if you want to), which is a lot of file I/O.

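To make the stat cost concrete, here is a rough Python sketch of a size-and-mtime comparison, similar in spirit to the quick check rsync does by default. It is not rsync's actual algorithm: the function name `changed_files` and the manifest format are made up for illustration. The point is only that the number of stat calls grows with the number of files in the tree, not with the number of files that changed.

```python
import os

def changed_files(root, manifest):
    """Return (changed relative paths, number of stat calls) under root.

    manifest is a hypothetical record of the last sync, mapping each
    relative path to the (size, mtime_ns) pair seen at that time.
    """
    changed = []
    stat_calls = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)  # one stat per file, changed or not
            stat_calls += 1
            rel = os.path.relpath(full, root)
            if manifest.get(rel) != (st.st_size, st.st_mtime_ns):
                changed.append(rel)
    return sorted(changed), stat_calls
```

Even when nothing has changed, `stat_calls` equals the number of files in the tree.
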
git has the advantage that it can just read the current HEAD and from
that know exactly what commits are missing, so there is way less
effort spent figuring out what changed. It has the disadvantage that
it sends everything that happened since your last sync, which could
include files that were created and subsequently removed. If you sync
often there won't be much of that, but if you're syncing monthly or
even less frequently then you probably will spend a lot of time
transmitting churn.

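The HEAD lookup and the churn problem can be shown with a toy model in Python. This is only a sketch under simplifying assumptions: history is a plain linear list rather than a real commit graph, nothing here reflects git's actual wire protocol or pack format, and the names `missing_commits` and `transmitted_churn` are invented for the example.

```python
def missing_commits(history, local_head):
    """Return the commits that come after local_head.

    history is a list of (commit_id, changes) pairs, oldest first, where
    changes maps a path to its new content, or to None for a deletion.
    """
    ids = [commit_id for commit_id, _changes in history]
    return history[ids.index(local_head) + 1:]

def transmitted_churn(commits, existing_paths):
    """Return paths created and then removed again within commits."""
    live = set(existing_paths)  # paths present at the local HEAD
    created = set()             # paths first added inside the range
    churned = set()
    for _commit_id, changes in commits:
        for path, content in changes.items():
            if content is None:
                if path in created:
                    churned.add(path)
                    created.discard(path)
                live.discard(path)
            else:
                if path not in live:
                    created.add(path)
                    churned.discard(path)  # re-added, so it survives
                live.add(path)
    return sorted(churned)

history = [
    ("c1", {"a.ebuild": "v1"}),
    ("c2", {"tmp.ebuild": "v1"}),
    ("c3", {"a.ebuild": "v2"}),
    ("c4", {"tmp.ebuild": None}),
]
missing = missing_commits(history, "c1")
churn = transmitted_churn(missing, {"a.ebuild"})
```

In this example `missing` holds c2 through c4 and `churn` is `["tmp.ebuild"]`: a file that had to be described in the transfer even though it does not exist in the final tree.
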
It is possible to trim down a repository, and as long as nobody is
doing force pushes on the main repo you should still be able to sync.
However, that takes more than a git one-liner. Personally I don't
mind the space tradeoff, especially in exchange for the reduced I/O.
A sync is always a VERY fast operation.

I'll also note that the stable branch (which is always free of obvious
issues caused by devs not running repoman) is only available via git.
There is no reason that couldn't be replicated via rsync, but right
now we only have one set of mirrors.

I'm still syncing from github after enabling signature checking.
There is a patch that will make that more secure but in the meantime
my scripts keep an eye on exit status when I sync. IMO signature
checking is more important than where you sync from - as long as gpg
says I'm good it really doesn't matter who has the ability to play
with the data en route. But, it certainly doesn't hurt to sync from
infra (I do have concerns about whether infra could handle everybody
doing it though - github is MS's problem to worry about).

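An exit-status check of the kind mentioned above could look roughly like the following Python sketch. `sync_succeeded` is a hypothetical helper, not part of Portage, and it assumes that the sync command exits non-zero when the sync or its signature verification fails.

```python
import subprocess

def sync_succeeded(cmd):
    """Run cmd (a list of arguments) and report whether it exited with 0.

    Assumption: a failed sync or failed signature check shows up as a
    non-zero exit status. A command that cannot be started at all is
    treated as a failure too.
    """
    try:
        result = subprocess.run(cmd)
    except OSError:
        return False
    return result.returncode == 0
```

A wrapper script would call something like `sync_succeeded(["emerge", "--sync"])` and refuse to go on to any updates when it returns False.
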
--
Rich