Gentoo Archives: gentoo-user

From: Grant Edwards <grant.b.edwards@×××××.com>
To: gentoo-user@l.g.o
Subject: [gentoo-user] Re: OT: Fighting bit rot
Date: Tue, 08 Jan 2013 22:17:30
Message-Id: kci5pj$rt5$1@ger.gmane.org
In Reply to: Re: [gentoo-user] Re: OT: Fighting bit rot by Alan McKinnon
On 2013-01-08, Alan McKinnon <alan.mckinnon@×××××.com> wrote:

>> When a hard drive starts to fail, you don't unknowingly get back
>> "rotten" data with some bits flipped. You get either a "seek error"
>> or "read error", and no data at all. IIRC, the same is true for
>> attempts to read a failing CD.
>
> I see what Florian is getting at here, and he's perfectly correct.
>
> We techie types often like to think our storage is purely binary: the
> cells are either on or off and they never change unless we
> deliberately make them change. We think this way because we wrap our
> storage in layers to make it look that way, in the style of an API.
>
> The truth is that our storage is subject to decay. Hard drives are
> magnetic at heart, and atoms have to align and stay aligned for the
> drive to work. Floppies are infinitely worse at this, but drives are
> not immune. Writeable CDs do not have physical pits and lands like
> factory original discs have; they use chemicals to make reflective and
> non-reflective spots. The list of points of corruption is long and
> they all happen after the data has been committed to physical storage.

True. But, in my experience, the chance of any of those failures
resulting in a successful read of incorrect data is vanishingly small.

> Worse, you only know about the corruption by reading it; there is no
> other way to discover if the medium and the data are still OK. He
> wants to read the medium occasionally

That may be a good idea, and will detect media failures.

> and verify it

That's the part I think is pointless in practice (if you're trying to
detect failing media).

> while the backups are still usable, and not wait for the point of no
> return - the "read error" from a medium that long since failed.

My point is that _comparing_data_to_a_backup_ just isn't a useful,
practical way to detect failing hard drives, optical drives, or CDs.
I've seen a lot of hard drives, optical drives, floppy drives,
floppies, and CDs fail. The failure mode in every case has been a "seek
error" or "read error" resulting in _no_data_ rather than a read
returning erroneous data.

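To make the two failure modes in this discussion concrete, here is a
minimal sketch in Python. It is not something posted in the thread, and
the names scan(), sha256_of(), primary_root and backup_root are made up
for illustration. It reads every file under one directory tree, records
a "read error" when the OS refuses to return data, and records a
"mismatch" when the read succeeds but the contents differ from the
backup copy. Note that a mismatch can also simply mean the file was
edited after the backup was taken, so this only makes sense for data
that is not supposed to change.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks; an OSError from open/read propagates to the caller."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan(primary_root, backup_root):
    """Read every file under primary_root and classify it against backup_root."""
    report = {
        "ok": [],                 # read fine, contents match the backup
        "read_error": [],         # the OS reported an error, no data returned
        "mismatch": [],           # read "succeeded" but contents differ
        "no_backup": [],          # nothing to compare against
        "backup_read_error": [],  # the backup copy itself could not be read
    }
    for dirpath, _dirnames, filenames in os.walk(primary_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, primary_root)
            try:
                primary_hash = sha256_of(path)
            except OSError:
                report["read_error"].append(rel)
                continue
            backup_path = os.path.join(backup_root, rel)
            if not os.path.isfile(backup_path):
                report["no_backup"].append(rel)
                continue
            try:
                backup_hash = sha256_of(backup_path)
            except OSError:
                report["backup_read_error"].append(rel)
                continue
            key = "ok" if primary_hash == backup_hash else "mismatch"
            report[key].append(rel)
    return report
```

The read pass alone (the "read_error" list) is the part that catches
the seek/read errors described above; the hash comparison is the extra
verification step whose practical value is being debated here.
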
It seems that in laboratory conditions, people have managed to see
erroneous data, but I'm not convinced worrying about it is worthwhile.

IMO, having backup data _is_ very valuable, but regularly reading
files and comparing them to backup copies isn't a useful way to detect
failing media.

You're much more likely to detect failing RAM (which is useful, but
there are better ways to do it).

--
Grant Edwards               grant.b.edwards        Yow! I think I am an
                                  at               overnight sensation right
                              gmail.com            now!!
