On Tue, 8 Jan 2013 19:53:41 +0000 (UTC)
Grant Edwards <grant.b.edwards@×××××.com> wrote:

> On 2013-01-08, Pandu Poluan <pandu@××××××.info> wrote:
> > On Jan 8, 2013 11:20 PM, "Florian Philipp" <lists@×××××××××××.net>
> > wrote:
> >>
> >
> > -- snip --
> >
> >>
> >> Hmm, good idea, albeit similar to the `md5sum -c`. Either tool
> >> leaves you with the problem of distinguishing between legitimate
> >> changes (i.e. a user wrote to the file) and decay.
> >>
> >> When you have completely static content, md5sum, rsync and friends
> >> are sufficient. But if you have content that changes from time to
> >> time, the number of false-positives would be too high. In this
> >> case, I think you could easily distinguish by comparing both file
> >> content and time stamps.
> >>
> >> Now, that of course introduces the problem that decay could occur
> >> in the same time frame as a legitimate change, thus masking the
> >> decay. To reduce this risk, you have to reduce the checking
> >> interval.
> >>
> >> Regards,
> >> Florian Philipp
> >
> > IMO, we're all barking up the wrong tree here...
> >
> > Before a file's content can change without user involvement, bit
> > rot must first get through the checksum (CRC?) of the hard disk
> > itself. There will be no 'gradual degradation of data', just
> > 'catastrophic data loss'.
>
> When a hard drive starts to fail, you don't unknowingly get back
> "rotten" data with some bits flipped. You get either a "seek error"
> or "read error", and no data at all. IIRC, the same is true for
> attempts to read a failing CD.
|
I see what Florian is getting at here, and he's perfectly correct.

We techie types often like to think our storage is purely binary: the
cells are either on or off, and they never change unless we
deliberately make them change. We think this way because we wrap our
storage in layers to make it look that way, in the style of an API.
|
The truth is that our storage is subject to decay. Hard drives are
magnetic at heart, and atoms have to align and stay aligned for the
drive to work. Floppies are far worse at this, but hard drives are
not immune. Writable CDs do not have physical pits and lands like
factory-pressed discs; they use a chemical dye layer to make
reflective and non-reflective spots. The list of ways data can be
corrupted is long, and they all occur after the data has been
committed to physical storage.
|
Worse, you only find out about the corruption by reading the data;
there is no other way to discover whether the medium and the data are
still OK. Florian wants to read the medium occasionally and verify it
while the backups are still usable, rather than wait for the point of
no return: the "read error" from a medium that has long since failed.
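
For what it's worth, here is a rough sketch of that kind of periodic
check in Python. It follows the idea quoted above of comparing both
file content and time stamps. The function names (file_digest,
snapshot, compare) and the choice of SHA-256 are my own assumptions for
illustration, not anything Florian proposed, and how the old snapshot
gets stored between runs is left out entirely.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(root):
    """Map each file's path (relative to root) to (digest, mtime_ns)."""
    result = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            result[rel] = (file_digest(path), os.stat(path).st_mtime_ns)
    return result


def compare(old, new):
    """Classify every file of the old snapshot against the new one."""
    report = {}
    for rel, (old_digest, old_mtime) in old.items():
        if rel not in new:
            report[rel] = "missing"
            continue
        new_digest, new_mtime = new[rel]
        if new_digest == old_digest:
            report[rel] = "ok"
        elif new_mtime != old_mtime:
            # Content and time stamp both changed: probably a normal write.
            report[rel] = "modified"
        else:
            # Content changed but the time stamp did not: possible decay.
            report[rel] = "suspect"
    return report
```

You would take a snapshot on each run, compare it with the one from the
previous run, and look at anything marked "suspect" or "missing". As
noted in the quoted text, decay that lands in the same interval as a
legitimate write still shows up as "modified", so a shorter checking
interval narrows that window but does not close it.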
|
Maybe Florian's data is valuable enough to warrant the effort. I
know mine isn't, but his might be.
|
--
Alan McKinnon
alan.mckinnon@×××××.com