Gentoo Archives: gentoo-server

From: Matthias Bethke <matthias@×××××××.de>
To: gentoo-server@l.g.o
Subject: Re: [gentoo-server] Atrocious NFS performance
Date: Thu, 20 Apr 2006 17:26:20
Message-Id: 20060420172358.GA1403@huxley
In Reply to: Re: [gentoo-server] Atrocious NFS performance by Jeroen Geilman
Hi Jeroen,

To make a long story short (I'll tell it anyway for the archives'
sake), the problem seems solved so far. After recompiling the kernel
with 100 Hz ticks I see a slight increase in latency when doing the
same things as yesterday on the server, but it feels just as responsive
as before. Lacking any benchmarks from before the change, that's the
best measure I have. It might have been the firmware upgrade, but I
doubt it. If I have to reboot any time soon, maybe I'll do another test
to verify it.
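
(For the archives: the change boils down to the timer frequency option.
A quick way to check what a kernel was built with, assuming the usual
/usr/src/linux symlink; your .config will obviously differ:)

  $ grep '^CONFIG_HZ' /usr/src/linux/.config
  CONFIG_HZ_100=y
  CONFIG_HZ=100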

on Wednesday, 2006-04-19 at 20:42:09, you wrote:
> >The slowness is the same on SuSE and Gentoo based clients. The previous
> >installation handled the same thing without any problems, which I'd
> >certainly expect from a dual Xeon @3 GHz with 4 GB RAM, a Compaq
> >SmartArray 642 U320 host adapter and some 200 GB in a RAID5, connected
> >to the clients via GBit Ethernet.
> >
> RAID-5? Ouch.
> RAID-10 offers much better raw performance; since individual mirrors
> are striped, you get at least 4/3 the seek performance of a 4-disk

Yeah, but also at 2/3 the capacity: four disks in RAID-10 give you two
disks' worth of space where RAID5 gives you three. I know RAID5 isn't
exactly top-notch, but as long as the controller takes care of the
checksumming and distribution and the CPU doesn't have to, it's good
enough for our site. That's mostly students doing their exercises, web
browsing, some programming, usually all with small datasets. The
biggest databases are about two gigs and the disks write at just above
40 MB/s.

> >Definitely not good for GBit, but not so bad either considering it will
> >have taken half a minute just to open that file. The file is complete
> >despite the I/O error, but the error is definitely related to the server
> >load; it never happens normally (and I get 9-11s for the 100 MB).
> >
> LoadAvg of over 10 for I/O only? That is a serious problem.
> I repeat, that is a *problem*, not bad performance.

Huh? No, 9 to 11 seconds, i.e. ~10 MB/s. I don't see how this benchmark
could possibly drive my load up that high; after all, it's just one
process on the client and one on the server.

> Since you say the box has 4GB of RAM, what happens when you do a linear
> read of 2 or 3 GB of data, first uncached and then cached?
> That should not be affected by the I/O subsystem at all.

Writing gives me said 40 MB/s; reading it back (dd to /dev/null in 1 MB
chunks) is 32 MB/s uncached (*slower* than writes? Hm, controller
caching maybe...) and ~850 MB/s cached.

> Also, test your network speed by running netperf or iperf between
> client and server.
> Get some baseline values for maximum performance first!

I didn't test it, as the only thing I changed was the server software,
and it was just fine before. And it *is* fine as long as the server
disks aren't busy. Theoretically the Broadcom NIC driver could have
started sucking donkey balls in kernel 2.6, but ssh and the like are
just fine and speedy (~30 MB/s for a single stream of zeroes).
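
(The zero-stream test is nothing fancier than this, hostname made up;
since ssh burns cycles on encryption, treat it as a lower bound on what
the wire can do:)

  dd if=/dev/zero bs=1M count=512 | ssh fileserver 'cat >/dev/null'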

> And more bla I don't understand about NFS - what about the basics?
> Which versions are the server and client running?
> Since both could run either v2 or v3 and in-kernel or userspace, that's
> 4 x 4 = 16 possible combinations right there - and that is assuming they
> both run the *same* minor versions of the NFS software.

It's v3; that's why I snipped the unused v2 portions of the nfsstat
output. Both server and client are in-kernel (the client could only be
userspace via FUSE, right?) and on the latest stable versions:
nfs-utils-1.0.6-r6, with gentoo-sources-2.6.15-r1 on the client and
hardened-sources-2.6.14-r7 on the server.
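
(Easy enough to double-check on the client what a mount actually
negotiated:)

  nfsstat -m                  # per-mount details, if your nfs-utils has it
  grep ' nfs ' /proc/mounts   # the raw mount options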

> >And one parameter I haven't tried to tweak is the IO scheduler. I seem
> >to remember a recommendation to use noop for RAID5 as the cylinder
> >numbers are completely virtual anyway so the actual head scheduling
> >should be left to the controller. Any opinions on this?
> >
> I have never heard of the I/O scheduler being able to influence or get
> data directly from disks.
> In fact, as far as I know that is not even possible with IDE or SCSI,
> which both have their own abstraction layers.
> What you probably mean is the way the scheduler is allowed to interface
> with the disk subsystem - which is solely determined by the disk
> subsystem itself.

OK, that was a bit misleading. What I meant: the scheduler sees the
disk as one flat file and assumes that offsets into it correspond more
or less linearly to cylinders, which is what lets it implement things
like the elevator algorithm. That assumption is virtually always right
for simple drives, but it may not hold for a RAID.
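
(At least that's cheap to test: with 2.6 the elevator can be switched
per device at runtime through sysfs, assuming the other schedulers were
compiled in. The cciss device, name assumed here, shows up with the
slash replaced by a bang, quoted so the shell doesn't trip over it:)

  cat '/sys/block/cciss!c0d0/queue/scheduler'          # e.g. noop anticipatory deadline [cfq]
  echo noop > '/sys/block/cciss!c0d0/queue/scheduler'  # switch to noop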

> I'd recommend reading the specs for the raid controller - twice.
> Also dive into the module source if you're up for it - it can reveal a
> lot more than just plugging it in and adding disks.

Ugh... I read the O'Reilly book on Linux device drivers, so I know some
of the basics (up to kernel 2.4, that is), but I'd rather not make the
200+ KB of cciss source my first real project, especially not when my
only test hardware is the production server...

cheers!
Matthias
--
I prefer encrypted and signed messages. KeyID: FAC37665
Fingerprint: 8C16 3F0A A6FC DF0D 19B0 8DEF 48D9 1700 FAC3 7665

Replies

Subject Author
Re: [gentoo-server] Atrocious NFS performance Jeroen Geilman <jeroen@××××××.nl>