On Wed, Apr 27, 2022 at 10:22 AM Grant Edwards
<grant.b.edwards@×××××.com> wrote:
>
> Is there any advantage (either to me or the Gentoo community) to
> continue to use rsync and the rsync pool instead of switching the
> rest of my machines to git?
>
> I've been very impressed with the reliability and speed of sync
> operations using git they never take more than a few seconds.

With git you may occasionally need to wipe your repository to delete
history if you don't want it to accumulate (I don't think there is a
way to do that automatically, but if you can tell git to drop history,
let me know).

Of course, that history can come in handy if you need to revert something.

If you sync infrequently, say once a month or less, then I'd expect
rsync to be faster. This is because git has to fetch every single set
of changes since the last sync, while rsync just compares everything
at the file level. Over a long period of time that means that if a
package was revised 4 times and old versions were pruned 4 times, you
end up fetching and ignoring 2-3 versions of the package that would
never have been fetched at all with rsync. That can add up if it has
been a long time.
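
The arithmetic in that example can be sketched with a toy model. The
package names and both counting functions below are made up for
illustration, and the model counts whole files only: real git sends
compressed, delta-encoded packs, and rsync can send partial-file deltas,
so treat this as a rough picture rather than a measurement.

```python
# Toy model of the "revised 4 times, pruned 4 times" case.
# All names are hypothetical; this counts whole files and ignores
# git's pack compression and rsync's partial-file deltas.

def git_files_fetched(commits):
    """git walks history: every file added by any missed commit is fetched."""
    fetched = set()
    for added, _removed in commits:
        fetched.update(added)
    return len(fetched)


def rsync_files_fetched(local_tree, commits):
    """rsync compares end states: only files in the final tree that are missing locally."""
    final_tree = set(local_tree)
    for added, removed in commits:
        final_tree.update(added)
        final_tree.difference_update(removed)
    return len(final_tree - set(local_tree))


# Local checkout from the last sync, then four missed commits that each
# add a new revision and prune the previous one.
local_tree = {"foo-1.0.ebuild"}
missed_commits = [
    ({"foo-1.1.ebuild"}, {"foo-1.0.ebuild"}),
    ({"foo-1.2.ebuild"}, {"foo-1.1.ebuild"}),
    ({"foo-1.3.ebuild"}, {"foo-1.2.ebuild"}),
    ({"foo-1.4.ebuild"}, {"foo-1.3.ebuild"}),
]

print(git_files_fetched(missed_commits))                # 4
print(rsync_files_fetched(local_tree, missed_commits))  # 1
```

In this model three of the four revisions git fetched have already been
pruned by the time you sync, which is the overhead rsync avoids.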

On the other hand, if you sync frequently (especially daily or more
often), then git is FAR less expensive in both IO and CPU, on both your
side and the server side. Your git client and the server just
communicate what revision they're at, the server can see all the
versions you're missing, and it sends the history in between. Then your
client can see which objects it is missing that it wants and fetch
them. Since it is all de-duped by design, anything that hasn't
changed or that the repo has already seen will not need to be
transferred. With rsync you need to scan at least the entire filesystem
metadata on both ends to figure out what has changed, and if
your metadata isn't trustworthy you need to hash all the file contents
(which isn't done by default). Since git is content-hashed, you
basically get more data integrity than the default level for rsync, and
the only thing that needs to be read is the git metadata, which is
packed efficiently.
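
The de-duplication follows from content addressing: an object's ID is a
hash of its content, so identical content always has the same ID and is
never transferred twice. Here is a minimal sketch, assuming git's default
SHA-1 object format. The `objects_to_fetch` helper and the sample file
contents are hypothetical, and a flat dictionary stands in for git's real
negotiation, which works on commits and packs.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Object ID git assigns to a file's content (SHA-1 object format)."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def objects_to_fetch(remote_objects: dict, local_ids: set) -> dict:
    """Hypothetical helper: keep only the objects the client does not already hold."""
    return {oid: data for oid, data in remote_objects.items() if oid not in local_ids}


# Made-up file contents, for illustration only.
unchanged = b"EAPI=8\n# foo-1.0\n"
old_manifest = b"DIST foo-1.0.tar.gz\n"
new_manifest = b"DIST foo-1.0.tar.gz\nDIST foo-1.1.tar.gz\n"
new_ebuild = b"EAPI=8\n# foo-1.1\n"

# What the client already has, and what the server's current tree contains.
local_ids = {git_blob_id(unchanged), git_blob_id(old_manifest)}
remote_objects = {
    git_blob_id(c): c for c in (unchanged, new_manifest, new_ebuild)
}

missing = objects_to_fetch(remote_objects, local_ids)
print(len(missing))  # 2: the new Manifest and the new ebuild
```

The unchanged ebuild is skipped without either side reading the file
itself; comparing IDs is enough, and the same hash doubles as an
integrity check on whatever does get transferred.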

Bottom line is that I think git just makes more sense these days for
the typical Gentoo user, who is far more likely to be interested in
things like changelogs and commit histories than users of other
distros. I'm not saying it is always the best choice for everybody,
but you should consider it and improve your git-fu if you need to.
Oh, and if you want the equivalent of an old changelog, just go into a
directory and run "git whatchanged ."

-- 
Rich