All --

We haven't yet discussed how we're going to spend money (or decide how to
spend money), but below is a message from Robin that I think clearly
justifies spending some money on lark. It might be a good catalyst to get
us to figure out what our budgeting and expenditure approval process is
going to be. It can be as simple as someone (like me) posting an email
(like this) to this list, or it can be something more complex.

Ideas?

--kurt

----- Forwarded message from "Robin H. Johnson" <robbat2@g.o> -----

Date: Sat, 22 May 2004 02:55:28 -0700
From: "Robin H. Johnson" <robbat2@g.o>
Subject: CVS performance improvements
To: Gentoo Infrastructure <gentoo-infrastructure@l.g.o>

Some weeks ago, in #-infra we had a discussion about improving the
performance of the CVS server.

Some months ago we already moved the CVS lockfiles to a tmpfs, which provided a
very good speedup for many operations but still left room for improvement.
Before the lockfile move, a full cvs up could take more than 45 minutes if the
server was very busy. The lockfile move got us down below 20 minutes the great
majority of the time. As the server has become busier, we've recently been
hitting 20 minutes as the average time for a cvs up again.

Outside of the CVS tree itself, CVS uses two things:
lockfiles - (detailed above)
tmpfiles - stored in /tmp/cvs*/ by default. CVS figures out what it will
send you by building stuff here - a mixture of what you upload to it on
your file status and the latest file status.

The contents of tmpfiles don't last much longer than the duration of a
single checkout, so they're suited for caching in memory, but there is a
problem with this. It is a LOT of files. The usual space allocation is
~4 very small files (50-200 bytes ea.) for each directory - approx 1kb
of actual data per directory. We have 18812 directories for our
gentoo-x86 module presently, so this is a LOT of wasted space when using
4kb blocks. Checkouts are presently all on disk, and use up this space
already when they occur: ~300mb for a single 'cvs co gentoo-x86'.
However, due to the large number of files, esp. when we have multiple
checkouts/updates going, performance really hurts on /tmp (which is on
our system drive).
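
As a rough cross-check of the ~300mb figure, the arithmetic above can be
sketched in Python. This is only an estimate: it assumes each of the ~4
tmpfiles per directory occupies exactly one full 4kb block, and the constant
and function names are illustrative, not anything CVS itself exposes.

```python
# Rough estimate of /tmp space used by CVS tmpfiles for one full checkout.
# Assumption: each small tmpfile occupies exactly one 4 KiB block.
DIRECTORIES = 18812      # directories in gentoo-x86 (figure from the text)
FILES_PER_DIR = 4        # ~4 tmpfiles per directory
BLOCK_SIZE = 4096        # bytes per filesystem block
DATA_PER_DIR = 1024      # ~1kb of actual data per directory


def tmpfile_usage(directories=DIRECTORIES):
    """Return (allocated_bytes, actual_data_bytes) for one checkout."""
    allocated = directories * FILES_PER_DIR * BLOCK_SIZE
    actual = directories * DATA_PER_DIR
    return allocated, actual


if __name__ == "__main__":
    allocated, actual = tmpfile_usage()
    mib = 1024 * 1024
    print(f"allocated:   {allocated / mib:.0f} MiB")
    print(f"actual data: {actual / mib:.0f} MiB")
    print(f"wasted:      {100 * (1 - actual / allocated):.0f}%")
```

Under these assumptions the allocation comes to roughly 294 MiB for about
18 MiB of real data, which is in line with the ~300mb quoted above.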

Before, when we were wondering how to solve this bottleneck, we considered adding
more ram and using tmpfs. tmpfs is locked at using 4kb blocks due to how it's
designed, so we'd need huge amounts of memory - we frequently have 5-8
simultaneous large updates going (5*300mb = 1.5gb).

As a quick profiling of CVS usage, I processed some logs for the last 14
days, and excluded the gmirror, gweb and pylon users as they look to be using
scripts extensively and throwing off some statistics.

Top 5 users:
Total Runs | Username
---------------------
      1350 | gweb
      1382 | vapier
      1549 | mr_bones_
      1997 | gmirror
      2765 | pylon

UTC  | Average CVS
Hour | runs per hour
---------------------
00h  |  72.571
01h  |  62.357
02h  |  53.786
03h  | 245.786
04h  |  51.357
05h  |  63.857
06h  |  43.214
07h  |  56.786
08h  |  47.786
09h  |  46.714
10h  |  42.786
11h  |  50.643
12h  |  62.714
13h  |  58.643
14h  |  73.000
15h  |  77.357
16h  |  69.500
17h  |  82.571
18h  | 117.714
19h  |  66.643
20h  |  75.857
21h  |  65.143
22h  |  66.143
23h  |  77.857

In the above numbers, ~200 of the daily items are from pylon all around 3am.
gweb and gmirror together account for 10 runs / hour. Without pylon, gweb,
gmirror, we have ~1300 developer-initiated CVS runs each day. (I don't see any
other obvious script accesses). Of that, some 20% are cvs updates.
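
The ~1300 figure can be reproduced from the two tables with a short Python
sketch. It assumes the hourly averages include the scripted users (as the
paragraph above implies) and that the top-5 totals cover the same 14 days;
the names used here are illustrative only.

```python
# Cross-check of the "~1300 developer-initiated runs per day" figure.
# Hourly averages are copied from the table; script totals come from the
# top-5 table. Assumption: both tables cover the same 14 days of logs.
HOURLY_AVERAGES = [
    72.571, 62.357, 53.786, 245.786, 51.357, 63.857,
    43.214, 56.786, 47.786, 46.714, 42.786, 50.643,
    62.714, 58.643, 73.000, 77.357, 69.500, 82.571,
    117.714, 66.643, 75.857, 65.143, 66.143, 77.857,
]
SCRIPT_TOTALS = {"pylon": 2765, "gmirror": 1997, "gweb": 1350}
DAYS = 14
UPDATE_SHARE = 0.20      # "some 20% are cvs updates"


def developer_runs_per_day():
    """Total daily runs minus the daily runs of the scripted users."""
    total_per_day = sum(HOURLY_AVERAGES)
    scripted_per_day = sum(SCRIPT_TOTALS.values()) / DAYS
    return total_per_day - scripted_per_day


if __name__ == "__main__":
    dev = developer_runs_per_day()
    print(f"all runs per day:       {sum(HOURLY_AVERAGES):.0f}")
    print(f"developer runs per day: {dev:.0f}")
    print(f"of which cvs updates:   {dev * UPDATE_SHARE:.0f}")
```

This gives about 1731 runs per day in total and about 1294 once the three
scripted users are removed, consistent with the ~1300 quoted above.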

Looking to cut down on the wasted tail space, I ran some more tests on lark
today. I originally wanted to use ramdisk as a module (so I can specify the
size to 128mb), but this isn't possible as we have ramdisk compiled in for
initrd. Instead, I've utilized some of the MTD support in the kernel (all
added as modules), which provides a similar functionality to a ramdisk.
I loaded the MTD as a 64mb chunk in memory, and formatted with reiserfs, so I
could try out tail-packing for space saving.

A full 'cvs co gentoo-x86' needs 20mb of space with this setup. And a full
update of your tree, if you are very out of date (a month or more), could use
~50mb of space. A single cvs checkin won't take more than 100kb, absolute
worst case.

I believe the MTD+reiserfs route is definitely much more cost-effective on RAM
than tmpfs when the system is busy, but it has the disadvantage that when the
system isn't busy, the kernel would have reclaimed the tmpfs space, but can't
do so with the MTD space.
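
To make the RAM comparison concrete, here is a back-of-the-envelope sketch in
Python. The per-operation sizes are the ones quoted in this message; treating
every concurrent operation as a worst-case update is my simplifying
assumption, and the names are illustrative.

```python
# Back-of-the-envelope RAM needed for CVS tmpfiles under each approach.
# Per-operation sizes come from the text; treating every concurrent
# operation as a worst-case update is a simplifying assumption.
TMPFS_MB_PER_OP = 300     # 4kb blocks, no tail packing
REISERFS_MB_PER_OP = 50   # tail-packed, update of a month-old tree


def ram_needed_mb(concurrent_ops, mb_per_op):
    """RAM in MB needed to hold tmpfiles for the given concurrency."""
    return concurrent_ops * mb_per_op


if __name__ == "__main__":
    for ops in (5, 8):
        tmpfs = ram_needed_mb(ops, TMPFS_MB_PER_OP)
        reiser = ram_needed_mb(ops, REISERFS_MB_PER_OP)
        print(f"{ops} concurrent: tmpfs {tmpfs} MB, MTD+reiserfs {reiser} MB")
```

For 5 to 8 simultaneous operations this works out to 1500-2400 MB on tmpfs
against 250-400 MB with tail packing, though the tail-packed space stays
reserved even when the system is idle.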

Results from a testing run of MTD+reiserfs running on my user only:
(did 3 cvs up on month-old repos, and 2 full cvs co)
- 5-8 minutes for cvs up/co !! This is two to three times faster than before,
and it won't increase significantly in future, or with large numbers of users,
as memory has constant access time :-).

For this testing, I had 8 other developers doing either cvs up or cvs co at the
same time, and the i/o usage pushed loadavg to ~15 on lark.

There is one remaining problem, namely I can't get MTD to provide me with a block
larger than 64mb, or multiple blocks that span beyond 64mb. I'll play with some
of the MTD stuff at work this week (on a box with lots of memory), and see what
is needed to use lots of memory with MTD.

Originally when discussing tmpfs for the tempfiles and looking at the
purchasing of more ram, we said 2gb would be just enough, but thanks to
MTD+Reiserfs, it looks like 1gb will do very nicely, 2gb if we think we'll be
on lark for more than another 2 years (we could always add more memory later
when we get to that point).

--
Robin Hugh Johnson
E-Mail : robbat2@××××××××××××××.net
Home Page : http://www.orbis-terrarum.net/?l=people.robbat2
ICQ# : 30269588 or 41961639
GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85

----- End forwarded message -----