On Fri, Jun 21, 2013 at 10:57 AM, Rich Freeman <rich0@g.o> wrote:
> On Fri, Jun 21, 2013 at 1:40 PM, Mark Knecht <markknecht@×××××.com> wrote:
>> One place where I wanted to double check your thinking. My thought
>> is that a RAID1 will _NEVER_ outperform the hdparm -tT read speeds as
>> it has to read from three drives and make sure they are all good
>> before returning data to the user.
>
> That isn't correct. In theory it could be done that way, but every
> raid1 implementation I've heard of makes writes to all drives
> (obviously), but reads from only a single drive (assuming it is
> correct). That means that read latency is greatly reduced since they
> can be split across two drives which effectively means two heads per
> "platter." Also, raid1 typically does not include checksumming, so if
> there is a discrepancy between the drives there is no way to know
> which one is right. With raid5 at least you can always correct
> discrepancies if you have all the disks (though as Duncan pointed out
> in practice this only happens if you do an explicit scrub on mdadm).
> With btrfs every block is checksummed and so as long as there is one
> good (err, consistent) copy somewhere it will be used.
>
> Rich
>
|
Humm...

OK, we agree on RAID1 writes. All data must be written to all drives
so there's no way to implement any real speed up in that area. If I
simplistically assume that write speeds are similar to hdparm -tT read
speeds then that's that.
|
On the read side I'm not sure if I'm understanding your point. I agree
that a so-designed RAID1 system could/might read smaller portions of a
larger read from the RAID1 drives in parallel, taking some data from one
drive and some from another drive, and then only take corrective
action if one of the drives had trouble. However I don't know
that mdadm-based RAID1 does anything like that. Does it?
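For what it's worth, the parallel-read idea can be sketched in a few lines of Python. This is a toy model, not mdadm's actual balancing logic (the md raid1 driver picks one mirror per request, taking things like head position into account), but it shows the key property Rich describes: only one copy is ever read, so nothing gets compared between drives:

```python
# Toy model of RAID1 read balancing -- illustrative only, not mdadm's
# real algorithm. Each "mirror" holds an identical copy; a balancer
# directs each read to one mirror, so independent reads can proceed
# on different spindles at once.

class ToyRaid1:
    def __init__(self, n_mirrors):
        # Every mirror holds the same data, so any one can serve a read.
        self.data = {}
        self.reads_per_mirror = [0] * n_mirrors

    def write(self, sector, value):
        # Writes always hit every mirror -- no speedup possible here.
        self.data[sector] = value

    def read(self, sector):
        # Pick the least-busy mirror; only ONE copy is read, so the
        # drives are never compared against each other on a read.
        m = self.reads_per_mirror.index(min(self.reads_per_mirror))
        self.reads_per_mirror[m] += 1
        return self.data[sector]

raid = ToyRaid1(3)
for s in range(6):
    raid.write(s, f"block-{s}")
print([raid.read(s) for s in range(6)])
print(raid.reads_per_mirror)  # -> [2, 2, 2]: reads spread over 3 mirrors
```

Which, if that is what mdadm does, would also explain why a mismatch between mirrors only ever surfaces during a scrub: normal reads never look at the second copy.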
|
It seems to me that unless I at least _request_ all data from all
drives, and minimally compare at least some error flag from the
controller telling me one drive had trouble reading a sector, how
do I know if anything bad is happening?

Or maybe you're saying it's RAID1 and I don't know if anything bad is
happening _unless_ I do a scrub and specifically check all the drives
for consistency?

Just trying to get clear what you're saying.
|
I do mdadm scrubs at least once a week. I still do them by hand. They
have never appeared terribly expensive watching top or iotop, though
sometimes when I'm watching Netflix or Hulu in a VM I get more pauses
while the scrub is taking place. It's nothing huge.
|
I agree that RAID5 gives you an opportunity to get things fixed, but
there are folks who lose a disk in a RAID5, start the rebuild, and
then lose a second disk during the rebuild. That was my main reason to
go to RAID6. Not that I would ever run the array degraded, but that I
could still tolerate a second loss while the rebuild was happening and
hopefully get by. That was similar to my old 3-disk RAID1 where I'd
have to lose all 3 disks to be out of business.
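The tolerance arithmetic here can be written out explicitly. A simple counting sketch (assuming independent whole-disk failures, ignoring rebuild windows and unrecoverable-read-error rates):

```python
# Disk-loss tolerance of the layouts discussed -- simple counting,
# ignoring rebuild time, URE rates, and correlated failures.

def losses_tolerated(layout, n_disks):
    # A mirror set survives until the LAST copy dies; parity RAID
    # survives as many losses as it has parity disks.
    if layout == "raid1":
        return n_disks - 1   # N-way mirror: lose N-1, still alive
    if layout == "raid5":
        return 1             # one parity: a 2nd loss mid-rebuild is fatal
    if layout == "raid6":
        return 2             # two parities: survives a loss mid-rebuild
    raise ValueError(layout)

print(losses_tolerated("raid5", 5))   # 1
print(losses_tolerated("raid6", 6))   # 2
print(losses_tolerated("raid1", 3))   # 2 -- the 3-disk mirror case
```

So in raw loss-count terms a RAID6 and a 3-disk RAID1 both ride out two failures, which is the equivalence being described.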
|
Thanks, |
Mark |