On Tue, Feb 23, 2016 at 7:50 PM, Kristian Fiskerstrand <k_f@g.o> wrote:
>
> On 02/24/2016 01:33 AM, Duncan wrote:
>>
>> IMO, what's actually happening here is the slow deprecation of
>> rsync mirrors in favor of git. I doubt they'd be created at all
>> if gentoo were
>
> I don't agree to this at all. For one thing git is very resource
> intensive compared to rsync mirroring,
|
Is this actually true? For the typical use case of daily or close to
daily updates I'd think that git would be much more efficient.
|
rsync has to traverse an entire directory tree (both client and
server-side, though of course either could have it cached) and
synchronize across the network the metadata for every file to
determine what has changed, and then figure out what changed in each
file and transfer it. With a large git repository with only a few
hundred new commits, the client just tells the server what its last
commit is, the server walks back in history to find it, and then the
server can quickly identify all the new commits/trees/blobs and send
just those. With the COW design of git this is very efficient, since
it never has to traverse a subdirectory in which no files have changed.
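
To illustrate the "no need to traverse unchanged subdirectories"
point, here is a toy sketch in Python. It is not git's real object
format; the Node class and changed_paths function are made-up names,
and the only idea borrowed from git is that a directory's ID is a hash
over its children's IDs, so two subtrees with the same ID can be
skipped without looking inside them.

```python
import hashlib

class Node:
    """Toy git-like object: a file (bytes) or a directory (dict of Nodes).
    The ID is computed once at creation and stored, loosely the way git
    keeps an ID for every blob and tree. (Hypothetical model, not git.)"""
    def __init__(self, content):
        self.content = content
        if isinstance(content, bytes):
            data = b"blob\0" + content
        else:
            data = b"tree\0" + b"".join(
                name.encode() + b"\0" + child.id.encode()
                for name, child in sorted(content.items()))
        self.id = hashlib.sha1(data).hexdigest()

def changed_paths(old, new, path=""):
    """Return (changed paths, number of ID comparisons made)."""
    if old.id == new.id:
        return [], 1              # identical subtree, never descended into
    if isinstance(old.content, bytes) or isinstance(new.content, bytes):
        return [path], 1          # a file changed (or file/dir swapped)
    paths, visited = [], 1
    for name in sorted(set(old.content) | set(new.content)):
        child_path = f"{path}/{name}" if path else name
        a, b = old.content.get(name), new.content.get(name)
        if a is None or b is None:
            paths.append(child_path)   # entry added or removed
            continue
        sub_paths, sub_visited = changed_paths(a, b, child_path)
        paths += sub_paths
        visited += sub_visited
    return paths, visited
```

With three directories of 100 files each and one file changed, the
walk only opens the one directory whose ID differs; the other two are
dismissed with a single comparison each.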
|
In the degenerate case where nothing has changed, an rsync still needs
to walk the full tree and send a file list, while git just sends a
commit ID and terminates.
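
A rough sketch of that server-side walk, in Python. This assumes a
strictly linear history stored as a commit -> parent mapping, which is
a simplification; real git negotiates with want/have lines and has to
deal with merges. The function name and the commit IDs are made up.

```python
def commits_to_send(parents, server_head, client_head):
    """Walk back from server_head until client_head is reached.

    parents maps commit ID -> parent ID (None at the root commit).
    Linear history only; a simplification of what git really does.
    """
    new = []
    commit = server_head
    while commit is not None and commit != client_head:
        new.append(commit)
        commit = parents[commit]
    return new
```

If the client is already at the server's head the loop body never
runs and there is nothing to send, which is the degenerate case above.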
|
Now, for an infrequent sync (think months) where most of the tree has
changed I could certainly buy that a webrsync would be far more
efficient for everybody.
|
And just like rsync, git is easy to mirror, with GitHub being an
example of a service that will mirror anybody's repo for free, and they
seem to have no end to their bandwidth (though I've found that pushing
a full historical gentoo git tree to them does make them choke on it
for about 30 minutes before it shows up).
|
So, while I'll agree with the validity of your other points, I'd be
interested in actual data to back up the resource claim. I could see
that going either way, and which way probably depends on how
well-optimized everything is. Linus did a pretty good job with git.
|
> For one thing we can't expect users to keep an up
> to date copy of all gentoo developer's OpenPGP keys to verify each git
> commit, additionally this will cause issues with retirement and
> similar situations (certificate revocation, subkey rotations, expiries).
|
Well, we could do something (eventually) to make tracking keys easier,
but I'll still buy that the thick manifests are more secure. Git
commit signatures are only bound to their contents with SHA-1. I get
that nobody has demonstrated a practical attack on that, but I think
most crypto experts wouldn't heartily endorse the design.
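
To make the SHA-1 binding concrete, here is a small Python sketch of
how git names objects: the ID is the SHA-1 of a "<type> <size>" header,
a NUL byte, and the object body. The commit text below (author,
message) is invented for illustration. The point is that the text a
signature covers only contains the 40-character tree ID, so the files
themselves are covered only through the chain of SHA-1 hashes.

```python
import hashlib

def git_object_id(obj_type, body):
    """SHA-1 over b'<type> <size>' + NUL + body, as git names objects."""
    header = obj_type + b" " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

# An empty tree has an empty body; its ID is a well-known constant.
tree_id = git_object_id(b"tree", b"")

# Made-up commit text. A signed commit signs text like this, which
# refers to the tree (and through it every file) only by SHA-1 ID.
commit_body = (
    b"tree " + tree_id.encode() + b"\n"
    b"author A Dev <dev@example.org> 0 +0000\n"
    b"committer A Dev <dev@example.org> 0 +0000\n"
    b"\n"
    b"example commit\n"
)
commit_id = git_object_id(b"commit", commit_body)
```

Anyone who could produce a different tree with the same SHA-1 would
leave commit_body, and therefore the signature over it, unchanged.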
|
Keep in mind that we do have git mirrors that include metadata/etc
hosted on GitHub. I know people have concerns with their software
being proprietary, but as far as syncing goes it is just a mirror. I
doubt most of us audit all the distfiles mirrors we use to make sure
they're only using FOSS FTP/HTTP servers and so on. There really
isn't any reason that it couldn't be hosted on infra either, assuming
they wanted the extra load (and I don't see the point in it, since it
is just a mirror, and if it ever goes away it is trivial to just point
the scripts that generate it to push to some other mirror instead -
git itself is completely FOSS).
|
Again, I have nothing against devs maintaining rsync and changelogs,
and users making use of them. I just don't see it as the end of the
world if devs decide to stop taking care of them.
|
--
Rich