On Tue, 23 Dec 2014 22:24:30 -0500, Rich Freeman wrote:

> On Tue, Dec 23, 2014 at 4:08 PM, Holger Hoffstätte
> <holger.hoffstaette@××××××××××.com> wrote:
>> On Tue, 23 Dec 2014 21:54:00 +0100, Stefan G. Weichinger wrote:
>>
>>> In the other direction: what protects against these errors you
>>> mention?
>>
>> ceph scrub :)
>>
>>
> Are you sure about that? I was under the impression that it just
> checked that everything was retrievable. I'm not sure if it compares
> all the copies of everything to make sure that they match, and if they
> don't match I don't think that it has any way to know which one is
> right. I believe an algorithm just picks one as the official version,
> and it may or may not be identical to the one that was originally
> stored.

There's light and deep scrub; the former does what you described,
while deep does checksumming. In case of mismatch it should create
a quorum. Whether that actually happens and/or works is another
matter. ;)
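
To make the quorum idea concrete, here is a rough sketch of how a
deep-scrub-style check could vote on replica checksums. It is not Ceph's
actual implementation; the function names and the replica labels are
made up for illustration, and SHA-256 is just a convenient choice.

```python
import hashlib
from collections import Counter


def checksum(data: bytes) -> str:
    """Digest used to compare replicas (SHA-256 is an arbitrary choice)."""
    return hashlib.sha256(data).hexdigest()


def deep_scrub(replicas: dict):
    """Compare replica checksums and vote on the result.

    `replicas` maps a (hypothetical) replica name to its stored bytes.
    Returns (winning_digest, suspects). If no digest has a strict
    majority, winning_digest is None and every replica is suspect.
    """
    if not replicas:
        raise ValueError("need at least one replica")
    digests = {name: checksum(data) for name, data in replicas.items()}
    digest, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(replicas):
        # No strict majority: the vote cannot say which copy is right.
        return None, sorted(digests)
    suspects = sorted(name for name, d in digests.items() if d != digest)
    return digest, suspects
```

Note that a majority only tells you which version most replicas agree
on, not that it is what was originally stored, which is exactly the
concern raised above. With only two replicas a mismatch cannot be
resolved this way at all.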

Unfortunately a full point-in-time deep scrub and the resulting creation
of checksums is more or less economically unviable with growing amounts
of data; this really should be incremental. All distributed databases
suffer from the same problem, and the better ones eventually adopted the
incremental approach.

http://ceph.com/docs/master/rados/configuration/osd-config-ref/#scrubbing

I know how btrfs scrub works, but it too (and in fact every storage system)
suffers from the problem of having to decide which copy is "good"; they
only differ in the point on their timeline at which they decide that a
checksum is considered valid. When we're talking about preventing bitrot,
just having another copy is usually enough.
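
For the bitrot case, the alternative to voting is to record a checksum
once, at write time, and treat any copy that still matches it as good.
A toy sketch under that assumption follows; the block layout and
function names are invented for illustration, and zlib.crc32 is only a
stand-in checksum, not what any particular filesystem uses.

```python
import zlib


def write_block(data: bytes, copies: int = 2) -> dict:
    """Store `copies` replicas plus one checksum recorded at write time."""
    return {
        "csum": zlib.crc32(data),
        "copies": [bytearray(data) for _ in range(copies)],
    }


def scrub_block(block: dict) -> list:
    """Verify every copy against the stored checksum.

    Bad copies are overwritten from the first good copy. Returns the
    indices of the copies that were repaired; raises IOError if no
    copy matches the stored checksum any more.
    """
    good = [i for i, copy in enumerate(block["copies"])
            if zlib.crc32(bytes(copy)) == block["csum"]]
    if not good:
        raise IOError("no copy matches the stored checksum")
    source = bytes(block["copies"][good[0]])
    repaired = []
    for i in range(len(block["copies"])):
        if i not in good:
            block["copies"][i] = bytearray(source)
            repaired.append(i)
    return repaired
```

Here the "decision point" is the write: whatever was checksummed then is
by definition the valid version, so one intact extra copy is enough to
repair a flipped bit.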

On top of that btrfs will at least tell you which file is suspected,
thanks to its wonderful backreferences.

-h