On 08.01.2013 at 18:41, Pandu Poluan wrote:
>
> On Jan 8, 2013 11:20 PM, "Florian Philipp" <lists@×××××××××××.net
> <mailto:lists@×××××××××××.net>> wrote:
>>
>
> -- snip --
>
[...]
>>
>> When you have completely static content, md5sum, rsync and friends are
>> sufficient. But if you have content that changes from time to time, the
>> number of false-positives would be too high. In this case, I think you
>> could easily distinguish by comparing both file content and time stamps.
>>
[...]
>
> IMO, we're all barking up the wrong tree here...
>
> Before a file's content can change without user involvement, bit rot
> must first get through the checksum (CRC?) of the hard disk itself.
> There will be no 'gradual degradation of data', just 'catastrophic data
> loss'.
>

Unfortunately, that's only partly true. Latent disk errors are a
well-researched topic [1-3], and the drive's CRCs are not perfectly
reliable. The trick is to detect and correct errors while you still have
valid backups or other types of redundancy.

The only way to do this is regular scrubbing. That's why professional
archival solutions offer some kind of self-healing, which is usually just
the same as what I proposed (plus whatever on-access integrity checks
the platform supports) [4].
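
As a rough illustration of the scrubbing idea (compare both file content
and time stamps), here is a minimal Python sketch. The names `scrub` and
`file_digest` and the in-memory `manifest` layout are hypothetical choices
for this example, and SHA-256 is just one possible checksum; this is not
how any particular archival product works.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def scrub(root, manifest):
    """Check every file under root against manifest; return suspect paths.

    manifest (an assumed layout) maps path -> (mtime_ns, digest) and is
    updated in place for new or legitimately modified files.
    """
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mtime = os.stat(path).st_mtime_ns
            digest = file_digest(path)
            known = manifest.get(path)
            if known is None or known[0] != mtime:
                # New file or changed time stamp: treat as a normal edit
                # and record the new state.
                manifest[path] = (mtime, digest)
            elif known[1] != digest:
                # Same time stamp but different content: possible bit rot.
                suspects.append(path)
    return suspects
```

A file whose digest changed while its time stamp stayed the same is
reported as suspect and would be a candidate for restoring from a
known-good copy; a file with a new time stamp is treated as a legitimate
change and re-recorded.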

> I would rather focus my efforts on ensuring that my backups are always
> restorable, at least until the most recent time of archival.
>

That's the point:
a) You have to detect when you have to restore from backup.
b) You have to verify that the backup itself is still valid.
c) You have to avoid situations where undetected errors creep into the
backup.
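
Point b) could be checked along these lines. This is only a sketch:
`verify_backup` and the `recorded` mapping (relative path to SHA-256
digest, assumed to have been written when the backup was made) are
hypothetical names for illustration, not part of any existing backup tool.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_backup(backup_root, recorded):
    """Compare a backup tree against digests recorded at backup time.

    recorded maps relative path -> hex digest (assumed layout).
    Returns (missing, corrupted) as sorted lists of relative paths.
    """
    missing, corrupted = [], []
    for rel, digest in sorted(recorded.items()):
        path = os.path.join(backup_root, rel)
        if not os.path.isfile(path):
            missing.append(rel)
        elif sha256_of(path) != digest:
            corrupted.append(rel)
    return missing, corrupted
```

Running such a check before each new backup run also helps with point c):
a copy that no longer matches its recorded digest is noticed before it is
relied upon or propagated further.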

I'm not talking about a purely theoretical possibility. I have
experienced just that: some data that I had kept lying around for years
was corrupted.

[1] Schwarz et al.: Disk Scrubbing in Large, Archival Storage Systems
http://www.cse.scu.edu/~tschwarz/Papers/mascots04.pdf

[2] Baker et al.: A fresh look at the reliability of long-term digital
storage
http://arxiv.org/pdf/cs/0508130

[3] Bairavasundaram et al.: An Analysis of Latent Sector Errors in Disk
Drives
http://bnrg.eecs.berkeley.edu/~randy/Courses/CS294.F07/11.1.pdf

[4]
http://uk.emc.com/collateral/analyst-reports/kci-evaluation-of-emc-centera.pdf

Regards,
Florian Philipp