Gentoo Archives: gentoo-trustees

From: Kurt Lieber <klieber@g.o>
To: gentoo-trustees@l.g.o
Subject: [gentoo-trustees] How are we going to spend money?
Date: Sat, 22 May 2004 10:08:10
Message-Id: 20040522100914.GN26130@mail.lieber.org
1 All --
2
3 We haven't yet discussed how we're going to spend money (or decide how to
4 spend money) but below is a message from Robin that I think clearly
5 justifies spending some money on lark. It might be a good catalyst to get
6 us to figure out what our budgeting and expenditure approval process is
7 going to be. It can be as simple as someone (like me) posting an email
8 (like this) to this list or it can be something more complex.
9
10 Ideas?
11
12 --kurt
13
14 ----- Forwarded message from "Robin H. Johnson" <robbat2@g.o> -----
15
16 Date: Sat, 22 May 2004 02:55:28 -0700
17 From: "Robin H. Johnson" <robbat2@g.o>
18 Subject: CVS performance improvements
19 To: Gentoo Infrastructure <gentoo-infrastructure@l.g.o>
20
21 Some weeks ago, in #-infra we had a discussion about improving the
22 performance of the CVS server.
23
24 Some months ago we already moved the CVS lockfiles to a tmpfs which provided a
25 very good speedup for many operations, but still left room for improvement.
26 Before the lockfile move, a full cvs up could take more than 45 minutes if the
27 server was very busy. The lockfile move got us down below 20 minutes the great
28 majority of the time. As the server has become busier, we've recently been
29 hitting 20 minutes as the average time for a cvs up again.
30
31 Outside of the CVS tree itself, CVS uses two things:
32 lockfiles - (detailed above)
33 tmpfiles - stored in /tmp/cvs*/ by default, CVS figures out what it will
34 send you by building stuff here - a mixture of the file status you
35 upload to it and the latest file status.
36
37 The contents of tmpfiles don't last much longer than the duration of a
38 single checkout, so it's suited for caching in memory, but there is a
39 problem with this. It is a LOT of files. Its usual space allocation is
40 ~4 very small files (50-200 bytes ea.) for each directory - approx 1kb
41 of actual data per directory. We have 18812 directories for our
42 gentoo-x86 module presently, so this is a LOT of wasted space when using
43 4kb blocks. Checkouts are presently all on disk, and use up this space
44 already when they occur. ~300mb for a single 'cvs co gentoo-x86'.
45 However due to the large number of files, esp. when we have multiple
46 checkouts/updates going, performance really hurts on /tmp (which is on
47 our system drive).
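The space figures above can be checked with a little arithmetic. This is only a sketch: it assumes exactly four tmpfiles per directory, each rounded up to whole 4kb blocks, and about 1kb of real data per directory, as described in the paragraph. The function and constant names are made up for illustration.

```python
# Rough check of the tmpfile space figures quoted above.
# Assumptions (taken from the text, simplified): 4 small files per
# directory, ~1 KiB of real data per directory, 4 KiB filesystem blocks.
DIRECTORIES = 18812
FILES_PER_DIR = 4
DATA_PER_DIR_BYTES = 1024
BLOCK_SIZE = 4096


def allocated_bytes(directories, files_per_dir, data_per_dir, block_size):
    """Space used when every file is rounded up to whole blocks."""
    per_file = data_per_dir // files_per_dir
    blocks_per_file = -(-per_file // block_size)  # integer ceiling division
    return directories * files_per_dir * blocks_per_file * block_size


allocated = allocated_bytes(DIRECTORIES, FILES_PER_DIR,
                            DATA_PER_DIR_BYTES, BLOCK_SIZE)
payload = DIRECTORIES * DATA_PER_DIR_BYTES

print(f"allocated with 4 KiB blocks: {allocated / 2**20:.0f} MiB")
print(f"actual data:                 {payload / 2**20:.0f} MiB")
```

Under these assumptions the allocation comes out just under 300mb for one checkout, while the real data is under 20mb. The difference is the tail space lost to block rounding.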
48
49 Before, when we were wondering how to solve this bottleneck, we considered adding
50 more RAM and using tmpfs. tmpfs is locked at using 4kb blocks due to how it's
51 designed, so we'd need huge amounts of memory - we frequently have 5-8
52 simultaneous large updates going (5*300mb = 1.5gb).
53
54 As a quick profiling of CVS usage, I processed some logs for the last 14
55 days, and excluded the gmirror, gweb and pylon users as they look to be using
56 scripts heavily and throwing off some statistics.
57
58 Top 5 users:
59 Total Runs | Username
60 ---------------------
61 1350 | gweb
62 1382 | vapier
63 1549 | mr_bones_
64 1997 | gmirror
65 2765 | pylon
66
67 UTC | Average CVS
68 Hour | runs per hour
69 ---------------------
70 00h | 72.571
71 01h | 62.357
72 02h | 53.786
73 03h | 245.786
74 04h | 51.357
75 05h | 63.857
76 06h | 43.214
77 07h | 56.786
78 08h | 47.786
79 09h | 46.714
80 10h | 42.786
81 11h | 50.643
82 12h | 62.714
83 13h | 58.643
84 14h | 73.000
85 15h | 77.357
86 16h | 69.500
87 17h | 82.571
88 18h | 117.714
89 19h | 66.643
90 20h | 75.857
91 21h | 65.143
92 22h | 66.143
93 23h | 77.857
94
95 In the above numbers, ~200 of the daily items are from pylon all around 3am.
96 gweb and gmirror together account for 10 runs / hour. Without pylon, gweb,
97 gmirror, we have ~1300 developer-initiated CVS runs each day. (I don't see any
98 other obvious script accesses). Of that, some 20% are cvs updates.
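The ~1300 figure can be reproduced from the hourly table. A small sketch: the hourly averages are copied from the table above, and the script deductions (~200 runs per day for pylon, 10 runs per hour for gweb and gmirror together) are the approximations given in the text. Variable names are illustrative only.

```python
# Hourly averages (00h..23h) copied from the table above.
hourly_avg = [
    72.571, 62.357, 53.786, 245.786, 51.357, 63.857, 43.214, 56.786,
    47.786, 46.714, 42.786, 50.643, 62.714, 58.643, 73.000, 77.357,
    69.500, 82.571, 117.714, 66.643, 75.857, 65.143, 66.143, 77.857,
]

total_per_day = sum(hourly_avg)

# Approximate scripted traffic, as stated in the text.
pylon_per_day = 200            # ~200 runs, all around 03h UTC
gweb_gmirror_per_day = 10 * 24  # ~10 runs per hour combined

developer_per_day = total_per_day - pylon_per_day - gweb_gmirror_per_day

print(f"all runs per day:       {total_per_day:.0f}")
print(f"developer runs per day: {developer_per_day:.0f}")
```

The same deductions follow from the 14-day totals in the top-5 table: 2765 / 14 is about 198 runs per day for pylon, and (1350 + 1997) / 14 / 24 is about 10 runs per hour for gweb and gmirror.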
99
100 Looking to cut down on the wasted tail space, I ran some more tests on lark
101 today. I originally wanted to use ramdisk as a module (so I can specify the
102 size to 128mb), but this isn't possible as we have ramdisk compiled in for
103 initrd. Instead, I've utilized some of the MTD support in the kernel (all
104 added as modules), which provides a similar functionality to a ramdisk.
105 I loaded the MTD as a 64mb chunk in memory, and formatted with reiserfs, so I
106 could try out tail-packing for space saving.
107
108 A full 'cvs co gentoo-x86' needs 20mb of space with this setup. A full update of your
109 tree if you are very out of date (a month or more) could use ~50mb of space.
110 A single cvs checkin won't take more than 100kb absolute worst case.
111
112 I believe the MTD+reiserfs route is definitely much more cost-effective on RAM
113 than tmpfs when the system is busy, but it has the disadvantage that when the
114 system isn't busy, the kernel would have reclaimed the tmpfs space, but can't
115 do so with the MTD space.
116
117 Results from a testing run of MTD+reiserfs running on my user only:
118 (did 3 cvs up on month-old repos, and 2 full cvs co)
119 - 5-8 minutes for cvs up/co !! This is two to three times faster than before,
120 and it won't increase significantly in future, or with large numbers of users,
121 as memory has constant access time :-).
122
123 For this testing, I had 8 other developers doing either cvs up or cvs co at the
124 same time, and the i/o usage pushed loadavg to ~15 on lark.
125
126 There is one remaining problem, namely I can't get MTD to provide me with a block
127 larger than 64mb, or multiple blocks that span beyond 64mb. I'll play with some
128 of the MTD stuff at work this week (on a box with lots of memory), and see what
129 is needed to use lots of memory with MTD.
130
131 Originally when discussing tmpfs for the tempfiles and looking at the
132 purchasing of more ram, we said 2gb would be just enough, but thanks to
133 MTD+Reiserfs, it looks like 1gb will do very nicely, 2gb if we think we'll be
134 on lark for more than another 2 years (we could always add more memory later
135 when we get to that point).
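A rough comparison of the memory each approach would need for the tempfiles alone, as a sketch only. It assumes 5 to 8 simultaneous large operations (from the text), ~300mb each on a filesystem with 4kb blocks, and ~50mb each (the worst-case update figure) with tail-packing. It ignores everything else lark needs RAM for, so it is not a sizing recommendation. The names are hypothetical.

```python
# Per-operation tempfile footprint in MB, taken from the figures above.
PER_OP_4K_BLOCKS_MB = 300    # tmpfs, 4 KiB blocks
PER_OP_TAIL_PACKED_MB = 50   # MTD+reiserfs, worst-case large update


def tempfile_memory_mb(simultaneous_ops, per_op_mb):
    """Memory needed if every operation hits its footprint at once."""
    return simultaneous_ops * per_op_mb


estimates = {
    ops: (tempfile_memory_mb(ops, PER_OP_4K_BLOCKS_MB),
          tempfile_memory_mb(ops, PER_OP_TAIL_PACKED_MB))
    for ops in range(5, 9)
}

for ops, (tmpfs_mb, packed_mb) in estimates.items():
    print(f"{ops} ops: tmpfs ~{tmpfs_mb} MB, tail-packed ~{packed_mb} MB")
```

With these assumptions the tail-packed case stays at or below ~400mb even at 8 simultaneous operations, where tmpfs would want 1.5gb to 2.4gb.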
136
137 --
138 Robin Hugh Johnson
139 E-Mail : robbat2@××××××××××××××.net
140 Home Page : http://www.orbis-terrarum.net/?l=people.robbat2
141 ICQ# : 30269588 or 41961639
142 GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85
143
144
145
146 ----- End forwarded message -----
