Gentoo Archives: gentoo-dev

From: Ned Ludd <solar@g.o>
To: "Robin H. Johnson" <robbat2@g.o>
Cc: Gentoo Developers <gentoo-dev@l.g.o>
Subject: Re: [gentoo-dev] A few modest suggestions regarding tree size
Date: Thu, 14 Oct 2004 14:08:43
Message-Id: 1097762832.1944.39.camel@simple
In Reply to: Re: [gentoo-dev] A few modest suggestions regarding tree size by "Robin H. Johnson"
On Wed, 2004-10-13 at 03:31, Robin H. Johnson wrote:
> On Wed, Oct 13, 2004 at 09:01:06AM +0200, Spider wrote:
> > > For real benefits, reducing the number of files, or using a filesystem
> > > that performs tail packing reduces the amount of disk seek that must
> > > be done, really increases performance given the number of small files.
> This is still applicable to your method as well.
>
> The one thing that your (previously known) method does bring out is that
> reducing the I/O required really helps.
>
> > Well, here's another method ;)
> >
> > /root/portage.img on /usr/portage type ext2 (rw,noatime,loop=/dev/loop0)
> > -rw-r--r-- 1 root root 293M Oct 12 23:17 /root/portage.img
> > /root/portage.img 257M 195M 62M 77% /usr/portage
> >
> >
> > some varied interesting things from tune2fs -l
> > Filesystem features: dir_index sparse_super
> > Inode count: 300144
> > Block count: 300000
> > Free blocks: 62825
> > Free inodes: 154512
> > Block size: 1024
> > Fragment size: 1024
> Pack it into a loopback reiserfs instead, way better performance. For
> an even bigger boost, put the loop file into tmpfs or use some other
> direct memory scheme.
>
> See:
> http://dev.gentoo.org/~robbat2/fastcvstest
>
> I developed the above when I was working on super-fast CVS repositories,
> as I needed my client to not be the bottleneck ;-). My record for a
> complete CVS checkout of gentoo-x86 (over the network to a remote
> client) stands at 65 seconds. This is quite a bit more work than an
> rsync checkout as well.
>
> Provided you can assure only a single client is using the loopback
> system, here is a very good way of keeping it fast, but not needing the
> network traffic of a full checkout:
> portage loop file is usually on disk; when a sync is needed:
> 1. umount loop file
> 2. copy loop file to /dev/shm or other fast place
> 3. mount loop file again (from new location)
> 4. run updates to loop filesystem ('cvs up; emerge metadata' or 'emerge sync')
> 5. umount loop file, copy back to disk
> 6. mount loop file again
>
> The optimal reiserfs mount options are approximately:
> noexec,nosuid,nodev,noatime,nodiratime,nolog
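For concreteness, the six-step dance above can be sketched as a shell function. All paths here (the image name, the /dev/shm staging copy, the mount options) are illustrative assumptions, not anyone's actual script, and it only runs when the image really exists and you are root:

```shell
# Sketch of the loop-file sync procedure described above.
# Paths and mount options are assumptions -- adjust for your setup.

IMG=/root/portage.img          # loop file normally kept on disk
FAST=/dev/shm/portage.img      # fast tmpfs staging copy
MNT=/usr/portage

sync_portage_img() {
    umount "$MNT"                                   # 1. umount loop file
    cp "$IMG" "$FAST"                               # 2. copy to /dev/shm
    mount -o loop,noatime,nodiratime "$FAST" "$MNT" # 3. remount from RAM
    emerge sync                                     # 4. update the tree
    umount "$MNT"                                   # 5. umount, copy back
    cp "$FAST" "$IMG" && rm -f "$FAST"
    mount -o loop,noatime "$IMG" "$MNT"             # 6. mount from disk again
}

# Only attempt the dance when the image exists and we have root.
if [ -f "$IMG" ] && [ "$(id -u)" -eq 0 ]; then
    sync_portage_img
fi
```

The point of steps 2–3 is that every small-file read and write during the sync hits RAM instead of disk; only the two whole-image copies touch the disk, as sequential I/O.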
>
> Your performance may vary with nolog, I use it for the workload of the
> CVS server tmpdir, which is a very frequent creation of 50,000 tiny
> files [for every checkout/update].
>
> Solar has been doing work on putting the contents of the tree into a
> read-only squashfs filesystem and distributing that.

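The squashfs approach would look roughly like the following. This is a hedged sketch only: the image path and mksquashfs options are assumptions, not the invocation actually used, and the build is skipped entirely when squashfs-tools isn't installed:

```shell
# Rough sketch of packing the tree into a read-only squashfs image.
# Paths and options are assumptions for illustration.

build_portage_sqfs() {
    tree=${1:-/usr/portage}
    img=${2:-/root/portage.sqfs}
    mksquashfs "$tree" "$img" -noappend   # build a fresh compressed image
    # then loop-mount it read-only:
    #   mount -t squashfs -o loop,ro "$img" /usr/portage
}

# Only attempt the build when squashfs-tools is actually available.
if command -v mksquashfs >/dev/null 2>&1 && [ -d /usr/portage ]; then
    build_portage_sqfs
fi
```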
New loopback size is 11M after reading this thread and dumping ChangeLog
& metadata.xml files, which does seem like a perfectly feasible thing for
us to do. Removing leading/trailing whitespace and erroneous newlines
yielded no noticeable gains.
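Dumping those files is just a find(1) pass over the tree before repacking the image. A minimal sketch (the helper name and the /tmp demo paths are made up for illustration; point it at a scratch copy, not your live /usr/portage):

```shell
# Prune ChangeLog and metadata.xml from a copy of the tree before
# rebuilding the loopback image. Deletes in place under the given path.

prune_tree() {
    find "$1" -type f \( -name ChangeLog -o -name metadata.xml \) -delete
}

# Demonstration against a throwaway directory:
rm -rf /tmp/ptree
mkdir -p /tmp/ptree/app-misc/foo
echo log    > /tmp/ptree/app-misc/foo/ChangeLog
echo xml    > /tmp/ptree/app-misc/foo/metadata.xml
echo ebuild > /tmp/ptree/app-misc/foo/foo-1.0.ebuild
prune_tree /tmp/ptree
ls /tmp/ptree/app-misc/foo     # only foo-1.0.ebuild remains
```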

For fun I took it a step further to see what we could get if we moved
away from having locally stored digest/Manifest files, then re-compressed
and got the portage tree down to 8.5M. Yeah, that's 8.5M down from Spider's
195M, a savings of 186.5M. I don't think dumping the digest/Manifest
files would be too feasible at this time, however.

--
Ned Ludd <solar@g.o>
Gentoo (hardened,security,infrastructure,embedded,toolchain) Developer

Attachments

File name MIME type
signature.asc application/pgp-signature