On Wed, May 28, 2014 at 11:26 AM, Bob Sanders <rsanders@×××.com> wrote:
> Marc Joliet, mused, then expounded:
>> On Tue, 27 May 2014 15:39:38 -0700,
>> Bob Sanders <rsanders@×××.com> wrote:
>>
>> While I am far from a filesystem/storage expert (I see myself as a mere user),
>> the cited threads lead me to believe that this is most likely an
>> overhyped/misunderstood class of errors (e.g., posts [1] and [2]), so I would
>> suggest reading them in their entirety.
>>
>> [0] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31832
>> [1] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31871
>> [2] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31877
>> [3] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31821
>>
>
> FWIW - here's the FreeNAS ZFS ECC discussion on what happens with a bad
> memory bit and no ECC memory:
>
> http://forums.freenas.org/index.php?threads/ecc-vs-non-ecc-ram-and-zfs.15449/
>
|
I don't think that anybody debates that if you use btrfs/zfs with
non-ECC RAM you can potentially lose some of the protection afforded
by the checksumming.
|
What I'd question is whether this is a concern unique to btrfs/zfs.
I'd think the same failure modes would all apply to any other
filesystem.
|
So, the message should be that ECC RAM is better than non-ECC RAM, not
that those who use non-ECC RAM are better off using ext4 instead of
zfs/btrfs. I'd think that any RAM-related issue that would impact
zfs/btrfs would affect ext4 just as badly, and with ext4 you're also
vulnerable to all the non-RAM-related errors that checksumming was
created to solve.
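
To make that concrete, here is a minimal sketch in Python. It is not how
btrfs or zfs are implemented: write_block, read_ok and flip_bit are
hypothetical helpers, and zlib.crc32 is only a stand-in for whatever
checksum the filesystem actually uses. It compares a bit flip that happens
on disk (after the checksum was computed) with one that happens in RAM
(before the checksum was computed).

```python
import zlib


def flip_bit(data, bit):
    """Return a copy of data with a single bit inverted."""
    buf = bytearray(data)
    buf[bit // 8] ^= 1 << (bit % 8)
    return bytes(buf)


def write_block(data):
    """Hypothetical checksumming write: keep the data and its CRC."""
    return data, zlib.crc32(data)


def read_ok(stored, checksum):
    """Hypothetical checksumming read: does the data match its CRC?"""
    return zlib.crc32(stored) == checksum


original = b"important data"

# Case 1: the bit flips on disk, after the checksum was computed.
stored, csum = write_block(original)
disk_corruption_detected = not read_ok(flip_bit(stored, 3), csum)

# Case 2: the bit flips in RAM, before the checksum was computed,
# so the checksum is calculated over data that is already wrong.
stored, csum = write_block(flip_bit(original, 3))
ram_corruption_detected = not read_ok(stored, csum)

print(disk_corruption_detected)  # True
print(ram_corruption_detected)   # False
```

A filesystem that does not checksum data detects neither case, so in this
toy model the checksum never makes things worse: it catches the on-disk
flip and is merely blind to the in-RAM one.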
|
If your RAM is bad then all kinds of stuff can go wrong. Ditto for
your cache memory in the CPU, logic circuitry in the CPU, your busses,
etc. Most systems are not fault-tolerant of these system components
and the cost to make them fault-tolerant tends to be fairly high. On
the other hand, the good news is that you're far more likely to have
problems with data stored on a disk than in RAM, which is probably why
we haven't bothered to improve the other components.
|
Rich |