On 8/6/05, William Kenworthy <billk@×××××××××.au> wrote:
> Hi, is there any guide as to what scheduler is best for what purposes?
> I have used CFQ since the early days as I was rather unhappy with the
> performance of 2.6 as against 2.4, but I am finding the latest 2.6
> kernels do not seem to be running as well as the earlier ones with this
> scheduler - so I presume there have been changes/improvements etc.
> that may mean I need to reassess what I am using.

Firstly, if you wish to compare 2.6 to 2.4 fairly, on a
"like-for-like" basis, then configure the 2.6 kernel to run at 100Hz
and boot it with the "elevator=deadline" option. Secondly, it is
unwise to base any performance conclusions upon experiences with
earlier versions of 2.6. I would definitely recommend that you
reassess the situation if need be.
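
As an aside, recent 2.6 kernels allow the elevator to be inspected and
switched per-device at runtime through sysfs, which makes this sort of
experimentation painless. A rough Python sketch follows (the device
name is my assumption, so substitute your own; writing requires root
and will fail on kernels without runtime switching):

    # equivalent to:
    #   cat /sys/block/sda/queue/scheduler
    #   echo deadline > /sys/block/sda/queue/scheduler
    dev = "sda"  # assumption: substitute your own block device
    path = "/sys/block/%s/queue/scheduler" % dev

    # the active elevator is the bracketed entry, e.g. "noop [cfq] deadline"
    print(open(path).read().strip())

    f = open(path, "w")  # requires root
    f.write("deadline")
    f.close()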
|
The topic of elevators/schedulers has been discussed in great (and
revealing) detail in various posts over at kerneltrap.org in the past.
The most informative post that I can recall was written by someone who
unfortunately elected to remain anonymous, so I cannot credit him by
name. I daresay he would not object to my quoting him in full, so
that's what I'm going to do:

<quote="Anonymous on Monday, September 20, 2004 - 09:31">
This is just an executive summary; I don't know the details.
Normally a disk scheduler tries to balance between these goals:

- fairness (let every process have its share of the access to disk)
- performance (try to serve requests close to the current disk head
position first, because seeking there is fastest)
- realtime (guarantee that a request is serviced in a given time)

A scheduler gets as input the list of sectors that processes have
recently requested and which haven't yet been served. It knows where
the disk head currently is, and whatever other data it "remembers"
from past requests.

A typical scheduler is the cyclic elevator, in which the disk head
moves from the beginning of the disk to the end in one direction.
Assume that you get some requests in a sorted queue, and you're going
to satisfy them. You first filter the list of requests by deciding
that you will not satisfy requests for sectors below your current head
position. Then you simply go to the sector closest to your current
position in your sorted list. So, crunch crunch crunch, you move from
the start of the disk to the end. When you reach the highest
unsatisfied sector number, your cycle is over and you seek to the
lowest unsatisfied sector. Rinse and repeat.

Linux 2.2, at least, used the cyclic elevator if I remember right.

So, with these approximate goals in mind, we can look at the different
schedulers available.

- deadline scheduler: this is a cyclic elevator, but with a twist:
requests are given a deadline by which they must be served. When a
request starts to look like it's going to expire, the kernel will skip
intermediate sectors and move to that request, thus giving some
realtime behaviour.

- AS scheduler: a cyclic elevator with a waiting policy: after you
service a request that looks like it might have future requests coming
nearby, you pause even if there are more sectors in your work queue.
The anticipatory scheduler literally anticipates more requests to
follow on this track or very close by. How AS decides whether to
anticipate is basically just a lot of guesswork based on typical
access patterns.

- cfq scheduler: a different sort of stab at fairness. It's different
from the two above: it doesn't use a cyclic elevator, it has realtime
guarantees, and it aggressively avoids starvation. It could be a good
scheduler for multiuser systems.

- noop scheduler: just service the next request in the queue, without
any algorithm to prefer this or that request.

You want to use the noop scheduler on devices where there is no seek
penalty, such as flash drives; that's why a USB stick wants noop.
Unfortunately, hard disks are very mechanical beasts and their
performance is highly governed by their seeking abilities. All of the
schedulers above are really trying to figure out how to extract
maximum performance from the hard disk without causing bad behaviour
in other cases.
</quote>
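
To make the quoted sweep-and-deadline description concrete, here is a
toy sketch in Python. It is purely illustrative (bare sector numbers,
made-up names); the kernel's real elevators work on request queues
with merging and batching and are far more involved:

    # toy model of the quoted algorithms; "requests" are bare sector numbers
    def cyclic_pick(head, pending):
        # cyclic elevator: nearest pending sector at or after the head,
        # wrapping to the lowest pending sector once the sweep is done
        ahead = [s for s in pending if s >= head]
        return min(ahead) if ahead else min(pending)

    def deadline_pick(head, pending, now, expiry):
        # the deadline twist: serve any expired request first, regardless
        # of head position; otherwise fall back to the plain cyclic sweep
        # (noop, by contrast, would simply serve in arrival order)
        expired = [s for s in pending if expiry[s] <= now]
        if expired:
            return min(expired, key=lambda s: expiry[s])  # most overdue first
        return cyclic_pick(head, pending)

    # one full sweep: head at 100, requests at sectors 7, 50, 120 and 300
    pending, head, order = {7, 50, 120, 300}, 100, []
    while pending:
        head = cyclic_pick(head, pending)
        pending.discard(head)
        order.append(head)
    print(order)  # [120, 300, 7, 50] - up to the end, then wrap around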

Note that if you're using a sophisticated hardware RAID device then
"noop" may also be a good choice. This is because only one logical
block device may be presented to Linux, and it makes no sense for a
fancy elevator to spend its time reordering requests according to a
model of access that is irrelevant to the underlying media (which is
abstracted from the kernel).

The explanation of CFQ there is a little scanty. Another anonymous
poster at the same site summarised it rather well:

<quote="Anonymous on Wednesday, February 26, 2003 - 15:38">
CFQ aims to provide fairness among competing tasks in that, when more
than one task requests disk access, the I/O scheduler round-robins
among all requestors rather than letting one requestor dominate. It
can result in a seek storm, however, when the various tasks want to
access different areas of the disk. Thus, it can be bad for total
throughput. (This is unlike the network stack, where the cost of
switching between two unrelated streams is zero or nearly zero.)

AS seeks to improve throughput by reducing the seek storms that occur
when you switch among requestors. It observes that most small reads
are so-called dependent reads, and so even though a given requestor
might have only one request queued at the moment, it may have several
others for areas nearby on the disk. Thus, when it picks one of these
reads to schedule, it decides whether to hold off scheduling anything
else for that same disk for "a moment" to see if more read requests
arrive. The net effect is to reduce the total number of seeks between
competing streams by allowing greater batches of nearby reads to
execute together. Overall throughput should go up as a result.

CFQ should be better for low-latency tasks *if* the resultant seek
storms don't kill throughput. If you have two streams duking it out
and no hysteresis to keep you from hopping queues continuously, then
the resulting seeks could cause CFQ to leave you crying. In the
dependent read case, since the requestor's queue will go empty (and
cause a queue switch) after each request, it seems as though these
seek storms are practically a given without aggressive readahead.
</quote>
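
The seek storm is easy to demonstrate with one more toy sketch (again
illustrative Python with made-up numbers, not CFQ's actual logic):
strict round-robin between two sequential readers at opposite ends of
the disk drags the head across nearly its full span for every pair of
requests:

    # round-robin one request at a time from each per-process queue,
    # tallying a crude seek cost of |sector - head| per dispatch
    def round_robin(queues, head=0):
        order, seeks = [], 0
        while any(queues):
            for q in queues:
                if q:
                    s = q.pop(0)
                    seeks += abs(s - head)
                    head = s
                    order.append(s)
        return order, seeks

    a = [10, 11, 12, 13]          # sequential reads near the start
    b = [9000, 9001, 9002, 9003]  # sequential reads near the end
    print(round_robin([a, b]))    # head ping-pongs: 10, 9000, 11, 9001, ...

AS's anticipation is essentially a bet against exactly this ping-pong:
by idling briefly on the queue it has just served, it gives the next
dependent read a chance to arrive and be satisfied without the long
seek.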

Personally, I'm using the deadline elevator with an md (software)
RAID5 setup in conjunction with LVM2.

Here are my sources for the above quotations:

http://kerneltrap.org/node/3851
http://kerneltrap.org/node/592/2516

Cheers,

--Kerin Millar

--
gentoo-server@g.o mailing list