On Sat, Feb 27, 2016 at 02:14:12PM +0100, Luca Barbato wrote:
> On 24/02/16 01:33, Duncan wrote:
> > That option is there, and indeed, a patch providing it was specifically
> > added to portage for infra to use, because appending entries to existing
> > files is vastly easier and more performant than trying to prepend entries
> > and having to rewrite the entire file as a result.
> This sounds wrong in many different ways. The changelog files are tiny
> and makes next to no difference truncate+write or append.
Prior to separating ChangeLog files into years, this was way worse:
a kernel bump present in any of gentoo-sources, hardened-sources, or
vanilla-sources meant another 100k of data to send. It's not a lot
overall, but here are some quick stats from one of our rsync servers, on
bytes sent.

Stats for Feb 25, from one of the 3 primary rsync.g.o servers, on the
'bytes sent' output from rsyncd.

rsyncd example output:
Feb 25 00:03:17 quetzal rsyncd[27280]: sent 4930260 bytes received 32215 bytes total size 408174052

3909 entries.

Min RAW size: 4833709 bytes [1]
Median RAW size: 22436094 bytes.
Mean RAW size: 45652781 bytes.
Sum of RAW size: 178456721459 bytes = ~166GiB (per day!)

[1] The min possible transfer size comes from forcing an rsync with no
changes; it just sends the metadata about the files (path, mtime, size,
etc).

Let's subtract that from all the rest of the entries, to get stats about
the data transfer.

Median data size: 17602385 bytes
Mean data size: 40819072 bytes

So, now the question:
If we use appending changelogs, the large changelogs only differ by a
few hundred bytes. If we instead have to rewrite them, it's 50k+ per
changelog.

For each 50k changelog that gets rewritten, the median transfer would
get about 0.25% larger.
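The arithmetic behind that estimate can be checked directly. This is a
small sketch using only the figures quoted above; reading "50k" as
50,000 bytes is an assumption, and the names are illustrative.

```python
# Figures quoted above, in bytes.
MIN_RAW = 4833709       # no-change rsync: file metadata only
MEDIAN_RAW = 22436094
MEAN_RAW = 45652781

# Subtracting a constant from every entry shifts the median and the
# mean by that same constant.
median_data = MEDIAN_RAW - MIN_RAW
mean_data = MEAN_RAW - MIN_RAW

REWRITE_COST = 50_000   # assumption: "50k" read as 50,000 bytes


def growth_percent(extra_bytes, base_bytes):
    """Relative growth of a transfer, in percent."""
    return 100.0 * extra_bytes / base_bytes


print(median_data, mean_data)
print(round(growth_percent(REWRITE_COST, MEDIAN_RAW), 2))   # vs. raw median
print(round(growth_percent(REWRITE_COST, median_data), 2))  # vs. data-only median
```

Under these assumptions the result is roughly 0.22% against the raw
median and roughly 0.28% against the data-only median, which brackets
the 0.25% quoted above.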

--
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead, Foundation Trustee
E-Mail : robbat2@g.o
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85