Gentoo Archives: gentoo-server

From: Kerin Millar <kerframil@×××××.com>
To: gentoo-server@l.g.o
Subject: Re: [gentoo-server] It's amazing what switching the IO scheduler can do.
Date: Sat, 06 Aug 2005 13:55:33
Message-Id: 279fbba405080606533ac4d0e0@mail.gmail.com
In Reply to: Re: [gentoo-server] It's amazing what switching the IO scheduler can do. by William Kenworthy
On 8/6/05, William Kenworthy <billk@×××××××××.au> wrote:
> Hi, is there any guide as to what scheduler is best for what purposes?
> I have used CFQ since the early days as I was rather unhappy with the
> performance of 2.6 as against 2.4, but I am finding the latest 2.6
> kernels do not seem to be running as well as the earlier ones with this
> scheduler - so I presume there have been changes/improvements etc.
> that may mean I need to reassess what I am using.

Firstly, if you wish to compare 2.6 to 2.4 on a "like-for-like" basis,
configure the 2.6 kernel to run at 100Hz and boot it with the
"elevator=deadline" option. Secondly, it is superfluous to base any
performance notions upon experiences with earlier versions of 2.6. I
would definitely recommend that you reassess the situation if need be.
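
As an aside, recent 2.6 kernels also expose the elevator per-device
through sysfs, which makes experimenting rather less painful than
rebooting. A minimal sketch in Python (the device name "hda" is just
an example - substitute your own, and note your kernel must be new
enough to have the knob):

    # Inspect and switch the I/O elevator for one block device via
    # sysfs. Assumes a sufficiently recent 2.6 kernel and root
    # privileges; "hda" is an assumed example device.
    knob = "/sys/block/hda/queue/scheduler"

    # Reading lists the available elevators; the active one appears
    # in square brackets, e.g. "noop anticipatory deadline [cfq]".
    print(open(knob).read().strip())

    # Writing one of the listed names activates it immediately.
    open(knob, "w").write("deadline\n")
    print(open(knob).read().strip())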

The topic of elevators/schedulers has been discussed in great (and
revealing) detail in various posts over at kerneltrap.org in the past.
The most informative post that I can recall was posted by someone who
unfortunately elected to remain anonymous, so I cannot credit him by
name. I daresay he would not object to my quoting him in full, so
that's what I'm going to do:

<quote="Anonymous on Monday, September 20, 2004 - 09:31">
This is just an executive summary. I don't know the details.
Normally a disk scheduler tries to balance between these goals:

- fairness (let every process have its share of the access to disk)
- performance (try to serve requests close to the current disk head
position first, because seeking there is fastest)
- realtime (guarantee that a request is serviced in a given time)

A scheduler gets as input the list of sectors that processes have
recently requested and which haven't yet been served. It knows where
the disk head currently is, and whatever other data it "remembers"
from past requests.

A typical scheduler is the cyclic elevator, where the disk head moves
from the beginning of the disk to the end in one direction. Assume
that you get some requests in a sorted queue, and you're going to
satisfy them. So, you'll first filter the list of requests by deciding
that you will not satisfy requests for sectors below your current head
position. Then you'll simply go to the sector closest to your current
position in your sorted list. So, crunch crunch crunch, you move from
the start of the disk to the end. When you reach the highest
unsatisfied sector number, your cycle is over and you seek to the
lowest unsatisfied sector. Rinse and repeat.

Linux 2.2, at least, used the cyclic elevator, if I remember right.

So, with these approximate goals in mind, we can look at the different
schedulers available.

- deadline scheduler: this is a cyclic elevator but with a twist:
requests are given a deadline by which they get served. When a request
starts to look like it's going to expire, the kernel will skip
intermediate sectors and move to that request, thus giving some
realtime behaviour.

- AS scheduler: a cyclic elevator with a waiting policy: after you
service a request that looks like it might have future requests coming
nearby, you pause even if there are more sectors in your work queue.
The anticipatory scheduler literally anticipates more requests to
follow on this track or very close by. How AS decides whether to
anticipate is basically just a lot of guesswork based on typical
access patterns.

- cfq scheduler: a different sort of stab at fairness. It's different
from either of these two: it doesn't use a cyclic elevator, it has
realtime guarantees, and it aggressively avoids starvation. It could
be a good scheduler for multiuser systems.

- noop scheduler: just services the next request in the queue without
any algorithm to prefer this or that request.

You want to use the noop scheduler on devices where there is no
seeking penalty, such as flash drives. That's why a USB stick wants
noop. Unfortunately, hard disks are very mechanical beasts and their
performance is highly controlled by their seeking abilities. All these
schedulers above are really trying to figure out how to extract
maximum performance from the hard disk without causing bad behaviour
in other cases.
</quote>
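
To make the cyclic elevator and the deadline "twist" a little more
concrete, here is a toy simulation of my own (nothing to do with the
kernel's actual code - sectors are plain integers and one request is
serviced per tick):

    # Toy cyclic elevator with a deadline twist. "arrivals" maps a
    # tick to the sectors submitted at that tick; one request is
    # serviced per tick. All numbers are illustrative.
    def deadline_elevator(arrivals, expiry=3):
        pending = {}                  # sector -> tick it was submitted
        head, now, order = 0, 0, []
        while pending or any(t >= now for t in arrivals):
            for s in arrivals.get(now, []):
                pending[s] = now
            if pending:
                expired = [s for s, t in pending.items()
                           if now - t >= expiry]
                ahead = [s for s in pending if s >= head]
                if expired:
                    nxt = min(expired)    # rescue an expiring request
                elif ahead:
                    nxt = min(ahead)      # continue the upward sweep
                else:
                    nxt = min(pending)    # wrap back to the start
                del pending[nxt]
                head = nxt
                order.append(nxt)
            now += 1
        return order

    # A stream of high sectors keeps the sweep busy while sector 5
    # waits; the deadline rescues it mid-sweep instead of making it
    # wait for the wrap-around.
    print(deadline_elevator({0: [100], 1: [5, 110], 2: [120],
                             3: [130], 4: [140], 5: [150]}))
    # -> [100, 110, 120, 130, 5, 140, 150]

A plain cyclic elevator would have left sector 5 waiting until the
entire sweep wrapped; the deadline is what bounds that wait.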

Note that if you're using a sophisticated hardware RAID device then
"noop" may also be a good choice. This is because there may be only
one logical block device presented to Linux, and it doesn't make sense
for a fancy elevator to spend its time reordering requests according
to a model of access that is irrelevant to the underlying media (which
is abstracted from the kernel).
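
The same sysfs knob shown earlier applies if you want to try this; a
one-off sketch, again with an assumed device name ("sda" here) for
the array's exported block device:

    # Put the logical RAID device on the noop elevator and confirm.
    # "sda" is an assumed example name, not necessarily yours.
    knob = "/sys/block/sda/queue/scheduler"
    open(knob, "w").write("noop\n")
    print(open(knob).read().strip())   # "[noop]" should now be active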

The explanation for CFQ is a little scanty. Another anonymous poster
at the same site summarised it rather well:

<quote="Anonymous on Wednesday, February 26, 2003 - 15:38">
CFQ aims to provide fairness among competing tasks in that when more
than one task requests disk access, the I/O scheduler round-robins
among all requestors rather than letting one requestor dominate. It
can result in a seek storm, however, when the various tasks want to
access different areas of the disk. Thus, it can be bad for total
throughput. (This is unlike the network stack, where the cost of
switching between two unrelated streams is zero or nearly zero.)

AS seeks to improve throughput by reducing the seek storms that occur
when you switch among requestors. It observes that most small reads
are so-called dependent reads, and so even though a given requestor
might have only one request queued at the moment, it may have several
others for areas nearby on the disk. Thus, when it picks one of these
reads to schedule, it decides whether to hold off scheduling anything
else for that same disk for "a moment" to see if more read requests
arrive. The net effect is to reduce the total number of seeks between
competing streams by allowing greater batches of nearby reads to
execute together. Overall throughput should go up as a result.

CFQ should be better for low-latency tasks *if* the resultant seek
storms don't kill throughput. If you have two streams duking it out
and no hysteresis to keep you from hopping queues continuously, then
the resulting seeks could cause CFQ to leave you crying. In the
dependent read case, since the requestor's queue will go empty (and
cause a queue switch) after each request, it seems as though these
seek storms are practically a given without aggressive readahead.
</quote>
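
The seek-storm point lends itself to a back-of-the-envelope model.
Another toy sketch of my own (the sector numbers are made up): two
processes issue dependent reads in different areas of the disk, and
we total the head travel under pure round-robin versus AS-style
batching:

    # Compare total head travel for two interleaved streams of
    # dependent reads: round-robin between queues (CFQ-like, with no
    # anticipation) versus batching each stream (AS-like).
    stream_a = [1000, 1001, 1002, 1003]      # process A, low disk area
    stream_b = [90000, 90001, 90002, 90003]  # process B, high area

    def head_travel(order):
        # Sum of absolute seek distances, starting from sector 0.
        head, travel = 0, 0
        for sector in order:
            travel += abs(sector - head)
            head = sector
        return travel

    # Round-robin hops between the two queues after every request...
    rr = [s for pair in zip(stream_a, stream_b) for s in pair]
    # ...whereas anticipation stays with a stream and switches once.
    batched = stream_a + stream_b

    print(head_travel(rr))       # 623997: seven long seeks
    print(head_travel(batched))  # 90003: a single long seek

Seven long seeks collapse into one, which is the AS argument in
miniature; whether CFQ's fairness is worth that price depends
entirely on the workload.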

Personally, I'm using the deadline elevator with an md (software)
RAID5 setup in conjunction with LVM2.
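
For what it's worth, making that choice stick at boot is just a
matter of appending the option to the kernel line in grub.conf;
something along these lines (the kernel image name and root device
here are illustrative, not necessarily yours):

    kernel /boot/vmlinuz-2.6.12-gentoo root=/dev/md0 elevator=deadline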

Here are my sources for the above quotations:

http://kerneltrap.org/node/3851
http://kerneltrap.org/node/592/2516

Cheers,

--Kerin Millar

--
gentoo-server@g.o mailing list