Gentoo Archives: gentoo-dev

From: "Robin H. Johnson" <robbat2@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] Re: Bug #565566: Why is it still not fixed?
Date: Sat, 27 Feb 2016 22:51:00
Message-Id: robbat2-20160227T222504-652639091Z@orbis-terrarum.net
In Reply to: Re: [gentoo-dev] Re: Bug #565566: Why is it still not fixed? by Luca Barbato
1 On Sat, Feb 27, 2016 at 02:14:12PM +0100, Luca Barbato wrote:
2 > On 24/02/16 01:33, Duncan wrote:
3 > > That option is there, and indeed, a patch providing it was specifically
4 > > added to portage for infra to use, because appending entries to existing
5 > > files is vastly easier and more performant than trying to prepend entries
6 > > and having to rewrite the entire file as a result.
7 > This sounds wrong in many different ways. The changelog files are tiny
8 > and makes next to no difference truncate+write or append.
9 Prior to separating ChangeLog files into years, this was way worse:
10 a kernel bump present in any of gentoo-sources, hardened-sources, or
11 vanilla-sources meant another 100k of data to send. It's not a lot
12 overall, but here's some quick stats from one of our rsync servers, on
13 bytes sent.
14
15 Stats for Feb 25, from one of the 3 primary rsync.g.o servers, on the
16 'bytes sent' output from rsyncd.
17
18 rsyncd example output:
19 Feb 25 00:03:17 quetzal rsyncd[27280]: sent 4930260 bytes received 32215 bytes total size 408174052
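
A minimal sketch of how the 'sent' figure could be pulled out of log lines like the one above. The regular expression and the helper name `parse_sent_bytes` are assumptions based only on that single sample line, not the script actually used to produce these stats.

```python
import re

# Field layout inferred from the one sample line above (an assumption,
# not taken from rsync documentation).
SENT_RE = re.compile(
    r"sent (\d+) bytes\s+received (\d+) bytes\s+total size (\d+)"
)


def parse_sent_bytes(line):
    """Return the 'sent' byte count from an rsyncd summary line, or None."""
    match = SENT_RE.search(line)
    return int(match.group(1)) if match else None


sample = (
    "Feb 25 00:03:17 quetzal rsyncd[27280]: sent 4930260 bytes "
    "received 32215 bytes total size 408174052"
)
print(parse_sent_bytes(sample))
```

Lines that do not match (connection notices, errors) return None and can simply be skipped when collecting the per-transfer sizes.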
20
21 3909 entries.
22
23 Min RAW size: 4833709 bytes [1]
24 Median RAW size: 22436094 bytes.
25 Mean RAW size: 45652781 bytes.
26 Sum of RAW size: 178456721459 bytes = ~166GiB (per day!)
27
28 The min possible transfer size comes from forcing an rsync with no
29 changes; it just sends the metadata about the files (path, mtime, size, etc).
30
31 Let's subtract that from all the rest of the entries, to get stats about
32 the data transfer.
33
34 Median data size: 17602385 bytes
35 Mean data size: 40819072 bytes
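
The subtraction above can be checked with a few lines of arithmetic. This is only a sketch using the figures quoted in this mail; the constant names are made up for illustration.

```python
# Figures quoted in the mail (bytes).
MIN_RAW = 4833709        # metadata-only transfer, i.e. rsync with no changes
MEDIAN_RAW = 22436094
MEAN_RAW = 45652781
SUM_RAW = 178456721459

# Subtracting a constant overhead shifts both median and mean by that constant.
median_data = MEDIAN_RAW - MIN_RAW
mean_data = MEAN_RAW - MIN_RAW

# Daily total expressed in GiB (2**30 bytes).
sum_gib = SUM_RAW / 2**30

print(median_data, mean_data, round(sum_gib, 1))
```

The results match the median and mean data sizes listed above, and the daily sum comes out at roughly 166 GiB.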
36
37 So, now the question:
38 If we use appending changelogs, the large changelogs only differ by a
39 few hundred bytes. If we instead have to rewrite them, it's 50k+ per
40 changelog.
41
42 For each 50k changelog, the median transfer would get about 0.25% larger.
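
A rough sketch of where a figure of that order comes from. The mail does not say whether the raw median or the data-only median is the baseline, nor whether 50k means 50000 or 51200 bytes, so both are treated here as assumptions; `growth_percent` is a hypothetical helper.

```python
# Medians quoted in the mail (bytes).
MEDIAN_RAW = 22436094    # including the metadata-only overhead
MEDIAN_DATA = 17602385   # after subtracting the minimum transfer


def growth_percent(extra_bytes, baseline_bytes):
    """Percentage by which a transfer grows when extra_bytes are added."""
    return 100.0 * extra_bytes / baseline_bytes


# One rewritten changelog of about 50k, under both readings of "50k".
for extra in (50_000, 50 * 1024):
    for name, baseline in (("raw", MEDIAN_RAW), ("data", MEDIAN_DATA)):
        print(f"{extra} bytes vs {name} median: "
              f"{growth_percent(extra, baseline):.2f}%")
```

Under these assumptions the increase per rewritten changelog lands between roughly 0.22% and 0.29%, which brackets the 0.25% quoted above.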
43
44 --
45 Robin Hugh Johnson
46 Gentoo Linux: Developer, Infrastructure Lead, Foundation Trustee
47 E-Mail : robbat2@g.o
48 GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
