Gentoo Archives: gentoo-user

From: Rich Freeman <rich0@g.o>
To: gentoo-user@l.g.o
Subject: Re: Re[2]: [gentoo-user] Re: Portage, git and shallow cloning
Date: Fri, 06 Jul 2018 11:47:33
Message-Id: CAGfcS_mPS1b4BwkfaThE7tJ=kDKPG29qakSSCxpy3VG75whkVA@mail.gmail.com
In Reply to: Re[2]: [gentoo-user] Re: Portage, git and shallow cloning by Davyd McColl
On Fri, Jul 6, 2018 at 4:34 AM Davyd McColl <davydm@×××××.com> wrote:
>
> I understand that git history will build over time -- I'm less concerned
> with (eventual) disk usage than I am with the speed of `emerge --sync`,
> which (and perhaps I'm sorely mistaken) appeared to be faster using git
> than rsync -- hence my choice of git over rsync (the discussion at
> https://forums.gentoo.org/viewtopic-t-1009562.html shows me to not be
> alone in this experience).
>

From what I've generally seen and heard, git is much more efficient as
long as you sync frequently.

rsync has the advantage that it only transfers the minimum necessary
to get you from the tree you have now to the tree that is current. To
do this it has to stat every file (with default settings it compares
size and mtime; you can make it even slower with --checksum, which
reads every file's contents), and that is a lot of file I/O.
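
To make the per-file cost concrete, here is a rough sketch in Python.
It is not rsync's algorithm, only an illustration of the metadata walk
any "compare every file" approach has to do on each sync. The tree
layout and the names `count_stats` and `make_demo_tree` are made up for
the example.

```python
import os
import tempfile


def count_stats(root):
    """Walk root and stat every regular file; return (file_count, total_bytes)."""
    files = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # One metadata lookup per file, whether or not the file changed.
            st = os.stat(os.path.join(dirpath, name))
            files += 1
            total += st.st_size
    return files, total


def make_demo_tree(root, categories=3, packages=4):
    """Create a tiny hypothetical tree: catN/pkgM/pkgM-1.0.ebuild."""
    for c in range(categories):
        for p in range(packages):
            pkg_dir = os.path.join(root, f"cat{c}", f"pkg{p}")
            os.makedirs(pkg_dir)
            with open(os.path.join(pkg_dir, f"pkg{p}-1.0.ebuild"), "wb") as fh:
                fh.write(b"EAPI=7\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        make_demo_tree(tmp)
        print(count_stats(tmp))
```

Pointing `count_stats` at a real tree gives a feel for how many lookups
happen even when nothing has changed since the last sync.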

git has the advantage that it can just read the current HEAD and from
that know exactly which commits are missing, so far less effort is
spent figuring out what changed. It has the disadvantage that it sends
everything that happened since your last sync, which could include
files that were created and subsequently removed. If you sync often
there won't be much of that, but if you're syncing monthly or even
less frequently then you will probably spend a lot of time
transmitting churn.
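
The churn argument can be sketched with a toy model in Python. This is
an assumption-laden illustration, not how git or rsync work on the
wire: commits are plain dicts mapping a path to its content (None for a
deletion), and we only count file versions, ignoring git's packfile
compression. All names and paths are hypothetical.

```python
def apply_commits(tree, commits):
    """Return a copy of tree with each commit applied in order."""
    tree = dict(tree)
    for commit in commits:
        for path, content in commit.items():
            if content is None:
                tree.pop(path, None)  # deletion
            else:
                tree[path] = content
    return tree


def git_style_transfer(commits, last_synced):
    """Every file version written by commits after index last_synced."""
    sent = []
    for commit in commits[last_synced:]:
        for path, content in commit.items():
            if content is not None:
                sent.append(path)
    return sent


def rsync_style_transfer(old_tree, new_tree):
    """Only paths whose final content differs from what the client has."""
    return sorted(p for p, c in new_tree.items() if old_tree.get(p) != c)


commits = [
    {"app/foo-1.0.ebuild": "v1"},  # already on the client
    {"app/foo-1.1.ebuild": "v1"},  # added after the last sync
    {"app/foo-1.1.ebuild": "v2"},  # revised
    {"app/foo-1.1.ebuild": None, "app/foo-1.2.ebuild": "v1"},  # replaced
]
old_tree = apply_commits({}, commits[:1])
new_tree = apply_commits(old_tree, commits[1:])

if __name__ == "__main__":
    print(len(git_style_transfer(commits, 1)), "versions via history")
    print(len(rsync_style_transfer(old_tree, new_tree)), "files via tree diff")
```

In this toy history the client receives three file versions by
replaying commits but only one file by diffing the end states, and the
gap widens the longer you wait between syncs.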

It is possible to trim down a repository, and as long as nobody is
doing force pushes on the main repo you should still be able to sync.
However, that is not something that just involves a git one-liner.
Personally I don't mind the space tradeoff, especially in exchange for
the I/O savings. With git, a sync is always a VERY fast operation.

I'll also note that the stable branch (which is always free of obvious
issues caused by devs not running repoman) is only available via git.
There is no reason that couldn't be replicated via rsync, but right
now we only have one set of mirrors.

I'm still syncing from github after enabling signature checking.
There is a patch that will make that more secure, but in the meantime
my scripts keep an eye on the exit status when I sync. IMO signature
checking is more important than where you sync from - as long as gpg
says I'm good it really doesn't matter who has the ability to play
with the data en route. But it certainly doesn't hurt to sync from
infra (I do have concerns about whether infra could handle everybody
doing it though - github is MS's problem to worry about).
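
Watching the exit status can be as simple as the following Python
sketch. It is not the script mentioned above, just one way to do it,
and it assumes that a failed sync or failed verification shows up as a
non-zero exit status from `emerge --sync`. The function names are
invented for the example.

```python
import subprocess
import sys


def run_sync(cmd=("emerge", "--sync")):
    """Run the sync command and return its exit status."""
    return subprocess.run(list(cmd)).returncode


def check_sync(cmd=("emerge", "--sync")):
    """Run the sync and complain on stderr if it did not exit cleanly."""
    status = run_sync(cmd)
    if status != 0:
        print(
            f"sync exited with status {status}; do not trust the tree",
            file=sys.stderr,
        )
    return status
```

A cron wrapper could end with `sys.exit(check_sync())` so that whatever
runs afterwards (an update, a notification) can be skipped on failure.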

--
Rich
