On Wed, 2004-10-13 at 03:31, Robin H. Johnson wrote:
> On Wed, Oct 13, 2004 at 09:01:06AM +0200, Spider wrote:
> > > For real benefits, reducing the number of files, or using a filesystem
> > > that performs tail packing, reduces the amount of disk seeking that must
> > > be done, which really increases performance given the number of small files.
> This is still applicable to your method as well.
>
> The one thing that your (previously known) method does bring out is that
> reducing the I/O required really helps.
>
> > Well, here's another method ;)
> >
> > /root/portage.img on /usr/portage type ext2 (rw,noatime,loop=/dev/loop0)
> > -rw-r--r-- 1 root root 293M Oct 12 23:17 /root/portage.img
> > /root/portage.img 257M 195M 62M 77% /usr/portage
> >
> >
> > some varied interesting things from tune2fs -l
> > Filesystem features: dir_index sparse_super
> > Inode count: 300144
> > Block count: 300000
> > Free blocks: 62825
> > Free inodes: 154512
> > Block size: 1024
> > Fragment size: 1024
> Pack it into a loopback reiserfs instead, way better performance. For
> an even bigger boost, put the loop file into tmpfs or use some other
> direct memory scheme.
>
> See:
> http://dev.gentoo.org/~robbat2/fastcvstest
>
> I developed the above when I was working on super-fast CVS repositories,
> as I needed my client to not be the bottleneck ;-). My record for a
> complete CVS checkout of gentoo-x86 (over the network to a remote
> client) stands at 65 seconds. This is quite a bit more work than an
> rsync checkout as well.
>
> Provided you can ensure that only a single client is using the loopback
> system, here is a very good way of keeping it fast without needing the
> network traffic of a full checkout:
> The portage loop file usually lives on disk; when a sync is needed:
> 1. umount the loop file
> 2. copy the loop file to /dev/shm or another fast place
> 3. mount the loop file again (from the new location)
> 4. run updates on the loop filesystem ('cvs up; emerge metadata' or 'emerge sync')
> 5. umount the loop file, copy it back to disk
> 6. mount the loop file again
>
> The optimal reiserfs mount options are approximately:
> noexec,nosuid,nodev,noatime,nodiratime,nolog
>
> Your performance may vary with nolog; I use it for the workload of the
> CVS server tmpdir, which involves very frequent creation of 50,000 tiny
> files [for every checkout/update].
>
> Solar has been doing work on putting the contents of the tree into a
> read-only squashfs filesystem and distributing that.
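For anyone wanting to try it, Robin's shuffle above could be scripted roughly as follows. This is only a sketch: the image path, mount point, staging path, and choice of 'emerge sync' as the update command are assumptions, and it needs root to run.

```shell
#!/bin/sh
# Sketch of the 6-step loop-file shuffle described above.
# IMG, MNT and FAST are assumed paths -- adjust to your setup.
set -e
IMG=/root/portage.img          # reiserfs loop image on disk
MNT=/usr/portage               # where the tree is mounted
FAST=/dev/shm/portage.img      # RAM-backed staging copy
OPTS=noexec,nosuid,nodev,noatime,nodiratime,nolog

umount "$MNT"                  # 1. umount the loop file
cp "$IMG" "$FAST"              # 2. copy it to /dev/shm
mount -o loop,$OPTS "$FAST" "$MNT"   # 3. mount again from the fast location
emerge sync                    # 4. run the updates against RAM, not disk
umount "$MNT"                  # 5. umount and copy the result back to disk
cp "$FAST" "$IMG"
rm -f "$FAST"
mount -o loop,$OPTS "$IMG" "$MNT"    # 6. mount the on-disk image again
```

The point of the dance is that step 4, the seek-heavy part, runs entirely against memory; the two bulk copies are sequential I/O, which disks handle well.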
New loopback size is 11M after reading this thread and dumping the ChangeLog
and metadata.xml files, which does seem like a perfectly feasible thing for
us to do. Removing leading/trailing whitespace and erroneous newlines
yielded no noticeable gains.
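The pruning step amounts to something like the following. The scratch path and demo files are made up for illustration; you would run the find against a copy of the tree, never the live one.

```shell
#!/bin/sh
# Sketch: strip ChangeLog and metadata.xml from a *copy* of the tree.
# TREE is a hypothetical scratch path, not the live /usr/portage.
TREE=/tmp/portage-slim
mkdir -p "$TREE/app-misc/foo"
# demo files standing in for a real category/package directory
touch "$TREE/app-misc/foo/ChangeLog" \
      "$TREE/app-misc/foo/metadata.xml" \
      "$TREE/app-misc/foo/foo-1.0.ebuild"
# drop the two file types that were dumped to reach the 11M image
find "$TREE" -type f \( -name ChangeLog -o -name metadata.xml \) -delete
ls "$TREE/app-misc/foo"    # only foo-1.0.ebuild remains
```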

For fun I took it a step further to see what we could get if we moved
away from having locally stored digest/Manifest files, then re-compressed
and got the portage tree down to 8.5M. Yeah, that's 8.5M down from Spider's
195M, a savings of 186.5M. I don't think dumping the digest/Manifest
files would be feasible at this time, however.

--
Ned Ludd <solar@g.o>
Gentoo (hardened,security,infrastructure,embedded,toolchain) Developer