Gentoo Archives: gentoo-dev

From: "Robin H. Johnson" <robbat2@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] Put DESCRIPTION HOMEPAGE and LICENSE in another place
Date: Thu, 11 Aug 2005 05:26:54
Message-Id: 20050811052348.GA11298@curie-int.orbis-terrarum.net
In Reply to: [gentoo-dev] Put DESCRIPTION HOMEPAGE and LICENSE in another place by Carlos Silva
1 On Thu, Aug 11, 2005 at 01:04:25AM +0100, Carlos Silva wrote:
2 [snip]
3 > or metadata.xml. This way, users with slow connections don't download
4 > almost 1MB of info every time they sync.
5 Yes, your example occupies 1MB of space.
6 However, it does NOT equate to 1MB of bandwidth with each sync.
7
8 If you go and dig up the rsync stats output, you will get some data
9 like this:
10
11 (I generated this artificially; it's the difference between the 20050801 and
12 20050802 snapshots, as distributed by rsync)
13 ====
14 Number of files: 119194
15 Number of files transferred: 923
16 Total file size: 94682691 bytes
17 Total transferred file size: 1956000 bytes
18 Literal data: 1956000 bytes
19 Matched data: 0 bytes
20 File list size: 2898171
21 File list generation time: 40.292 seconds
22 File list transfer time: 0.000 seconds
23 Total bytes sent: 24233
24 Total bytes received: 3662519
25 sent 24233 bytes received 3662519 bytes 74479.84 bytes/sec
26 total size is 94682691 speedup is 25.68
27 ====
28
29 Now look at the received data. We got a total of 3662519 bytes. Of that,
30 2898171 bytes were the file list alone. The file list accounts for 79% of the
31 traffic!
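For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the file list share and the speedup from the stats above. The variable names are mine; the numbers are copied from the rsync output. It assumes the file list bytes are counted inside the received total, as described above.

```python
# Figures copied from the rsync stats output above.
file_list_size = 2_898_171   # "File list size"
bytes_sent = 24_233          # "Total bytes sent"
bytes_received = 3_662_519   # "Total bytes received"
total_size = 94_682_691      # "total size is"

# Share of the received traffic taken up by the file list.
file_list_share = file_list_size / bytes_received

# rsync reports speedup as total size over all bytes on the wire.
speedup = total_size / (bytes_sent + bytes_received)

print(f"file list share: {file_list_share:.1%}")
print(f"speedup: {speedup:.2f}")
```

The speedup computed this way matches the 25.68 that rsync printed.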
32
33 Another reason the actual file traffic is negligible is rsync's
34 compression: across a set of files being transferred, 'LICENSE="GPL-2"'
35 should turn up often enough that it gets compressed (probably as two symbols,
36 as 'LICENSE="' is more common).
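To illustrate the compression point, here is a rough sketch using Python's zlib as a stand-in for a deflate-style stream compressor. The sample data is made up (1000 hypothetical ebuild-like fragments that share one license line), so the ratio it prints says nothing exact about the real tree; it only shows that heavily repeated strings cost very little once compressed.

```python
import zlib

# Hypothetical sample data: 1000 ebuild-like fragments that all carry
# the same license line. This is NOT real tree content.
fragments = [
    f'DESCRIPTION="Package number {i}"\nLICENSE="GPL-2"\n'
    for i in range(1000)
]
raw = "".join(fragments).encode()

# Compress the whole stream at once, so repeats across files are matched.
packed = zlib.compress(raw)

print(f"raw: {len(raw)} bytes, compressed: {len(packed)} bytes")
```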
37
38 The diff between 20050801 and 20050802 is only 862668 bytes (uncompressed) (and
39 157728 bytes when bzip2'd), so either rsync needs some serious work done on
40 its file list code, or we should consider whether rsync is still the best fit for
41 portage.
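A quick sketch of how the rsync transfer compares with shipping the diff instead. Variable names are mine; the figures are the ones quoted in this mail.

```python
# Figures quoted in this mail.
rsync_received = 3_662_519   # total bytes received by rsync
diff_plain = 862_668         # uncompressed diff between the two snapshots
diff_bzip2 = 157_728         # the same diff after bzip2

# How many times larger the rsync transfer is than each diff.
ratio_plain = rsync_received / diff_plain
ratio_bzip2 = rsync_received / diff_bzip2

print(f"rsync vs plain diff: {ratio_plain:.1f}x")
print(f"rsync vs bzip2 diff: {ratio_bzip2:.1f}x")
```

So the rsync run moved roughly 4 times the plain diff and roughly 23 times the bzip2'd one.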
42
43 Another alternative would be to reduce the number of files in the tree.
44 (Merging digests and manifests would shave off ~20k files, converting the
45 metadata cache files into large single files would shave off another ~20k).
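As a back-of-the-envelope estimate of what that would save, here is a sketch that assumes the file list size scales linearly with the number of files. That linearity is my assumption, not something measured, and the ~20k figures are approximate, so treat the result as a rough guide only.

```python
# Figures copied from the rsync stats output earlier in this mail.
num_files = 119_194
file_list_size = 2_898_171

# Approximate savings suggested above: ~20k digest/manifest files
# plus ~20k metadata cache files.
files_removed = 20_000 + 20_000

# ASSUMPTION: file list bytes scale linearly with the file count.
bytes_per_entry = file_list_size / num_files
remaining_files = num_files - files_removed
estimated_list_size = bytes_per_entry * remaining_files

print(f"average file list cost: {bytes_per_entry:.1f} bytes per file")
print(f"estimated file list size: {estimated_list_size:.0f} bytes")
```

Under that assumption the file list would shrink by about a third.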
46
47 --
48 Robin Hugh Johnson
49 E-Mail : robbat2@××××××××××××××.net
50 Home Page : http://www.orbis-terrarum.net/?l=people.robbat2
51 ICQ# : 30269588 or 41961639
52 GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85
