On Sat, 5 Nov 2016 15:56:20 -0700
Zac Medico <zmedico@g.o> wrote:

> On 11/05/2016 03:22 PM, Michał Górny wrote:
> > On Sat, 5 Nov 2016 15:11:10 -0700
> > Zac Medico <zmedico@g.o> wrote:
> >
> >> On 11/05/2016 02:50 PM, Michał Górny wrote:
> >>> On Sat, 5 Nov 2016 13:43:15 -0700
> >>> Zac Medico <zmedico@g.o> wrote:
> >>>
> >>>> This is necessary in order to avoid "There are too many unreachable
> >>>> loose objects" warnings from automatic git gc calls.
> >>>>
> >>>> X-Gentoo-Bug: 599008
> >>>> X-Gentoo-Bug-URL: https://bugs.gentoo.org/show_bug.cgi?id=599008
> >>>> ---
> >>>> pym/portage/sync/modules/git/git.py | 6 ++++++
> >>>> 1 file changed, 6 insertions(+)
> >>>>
> >>>> diff --git a/pym/portage/sync/modules/git/git.py b/pym/portage/sync/modules/git/git.py
> >>>> index f288733..c90cf88 100644
> >>>> --- a/pym/portage/sync/modules/git/git.py
> >>>> +++ b/pym/portage/sync/modules/git/git.py
> >>>> @@ -101,6 +101,12 @@ class GitSync(NewBase):
> >>>> writemsg_level(msg + "\n", level=logging.ERROR, noiselevel=-1)
> >>>> return (e.returncode, False)
> >>>>
> >>>> + # For shallow fetch, unreachable objects must be pruned
> >>>> + # manually, since otherwise automatic git gc calls will
> >>>> + # eventually warn about them (see bug 599008).
> >>>> + subprocess.call(['git', 'prune'],
> >>>> + cwd=portage._unicode_encode(self.repo.location))
> >>>> +
> >>>> git_cmd_opts += " --depth %d" % self.repo.sync_depth
> >>>> git_cmd = "%s fetch %s%s" % (self.bin_command,
> >>>> remote_branch.partition('/')[0], git_cmd_opts)
> >>>
> >>> Does it have a performance impact?
> >>
> >> Yes, it takes about 20 seconds on my laptop. I suppose we could make
> >> this an optional thing, so that those people can do it manually if they
> >> want.
> >
> > So we have improvement from at most a few seconds for normal 'git pull'
> > to around a minute for shallow pull?
>
> Well we've got at least 3 resources to consider:
>
> 1) network bandwidth
> 2) disk usage
> 3) sync time
>
> For me, sync time doesn't really matter that much, but I suppose it
> might for some people.

For a common user, network bandwidth is not a problem with git (except
maybe for the huge initial clone). Especially when syncing frequently,
the gain from subsequent --depth=1 fetches is negligible. When syncing
rarely, you probably prefer snapshots anyway.

I doubt this would benefit even dial-up users; that is, I doubt that
more time would be saved on fetching than lost on all the operations
needed to keep things working. The additional data probably won't
affect data-plan users much either, especially since Gentoo is all
about fetching distfiles that are huge compared to the git updates for
the repository.

As for disk usage, again, the difference should be negligible. The
major difference is made on the initial fetch. Of course, regularly
pruning the repository will reduce its size. But then, pruning it with
non-shallow fetches would probably achieve a similar effect thanks to
delta compression.
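
The disk-usage claim can be checked on a given checkout by comparing
the output of `git count-objects -v` before and after pruning. A minimal
sketch, assuming Python 3 and git on PATH; the helper names
parse_count_objects and repo_object_stats are hypothetical and not part
of Portage:

```python
import subprocess


def parse_count_objects(output):
    """Parse the 'key: value' lines of `git count-objects -v` into a dict of ints."""
    stats = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[key.strip()] = int(value)
    return stats


def repo_object_stats(repo_path):
    """Run `git count-objects -v` in repo_path and return the parsed numbers.

    Without -H, git reports 'size' (loose objects) and 'size-pack' in KiB.
    """
    out = subprocess.check_output(
        ["git", "count-objects", "-v"],
        cwd=repo_path, universal_newlines=True)
    return parse_count_objects(out)
```

Calling repo_object_stats() on the repository before and after a
`git prune` (or a `git gc`) and comparing the 'size' and 'size-pack'
entries gives a rough idea of how much space is actually at stake.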

That leaves the sync time, which is becoming worse than rsync.
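
The timings quoted in this thread are rough estimates from individual
machines. A small sketch for measuring them on a particular checkout,
assuming Python 3 and git on PATH; time_command is a hypothetical
helper, not Portage code:

```python
import subprocess
import time


def time_command(cmd, cwd=None):
    """Run cmd once and return (returncode, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    returncode = subprocess.call(cmd, cwd=cwd)
    return returncode, time.perf_counter() - start


# Example use (not executed here; the repository path is an assumption):
#   rc, secs = time_command(["git", "prune"], cwd="/path/to/repo")
#   print("git prune: rc=%d, %.1fs" % (rc, secs))
```

Timing the prune step and the fetch step separately shows how much of
the total sync time each one accounts for.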

-- 
Best regards,
Michał Górny
<http://dev.gentoo.org/~mgorny/>