Gentoo Archives: gentoo-user

From: Dale <rdalek1967@×××××.com>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Hard drive error from SMART
Date: Tue, 12 Apr 2022 17:08:48
Message-Id: 590a0548-6121-f6b5-cac3-3bb9202f3b70@gmail.com
In Reply to: Re: [gentoo-user] Hard drive error from SMART by Rich Freeman
1 Rich Freeman wrote:
2 > On Mon, Apr 11, 2022 at 9:27 PM Dale <rdalek1967@×××××.com> wrote:
3 >> Thoughts. Replace as soon as drive arrives or wait and see?
4 >>
5 > So, first of all just about all my hard drives are in a RAID at this
6 > point, so I have a higher tolerance for issues.
7 >
8 > If a drive is under warranty I'll usually try to see if they will RMA
9 > it. More often than not they will, and in that case there is really
10 > no reason not to. I'll do advance shipping and replace the drive
11 > before sending the old one back so that I mostly have redundancy the
12 > whole time.
13 >
14 > If it isn't under warranty then I'll scrub it and see what happens.
15 > I'll of course do SMART self-tests, but usually an error like this
16 > won't actually clear until you overwrite the offline sector so that
17 > the drive can reallocate it. A RAID scrub/resilver/etc will overwrite
18 > the sector with the correct contents which will allow this to happen.
19 > (Otherwise there is no way for the drive to recover - if it knew what
20 > was stored there it wouldn't have an error in the first place.)
21 >
22 > If an error comes back then I'll replace the drive. My drives are
23 > pretty large at this point so I don't like keeping unreliable drives
24 > around. It just increases the risk of double failures, given that a
25 > large hard drive can take more than a day to replace. Write speeds
26 > just don't keep pace with capacities. I do have offline backups but I
27 > shudder at the thought of how long one of those would take to restore.
28 >
29
30
31 Sadly, I don't have RAID here but to be honest, I really need to have it
32 given the data and my recent luck with hard drives.  Drives used to get
33 dumped because they were just too small to use anymore.  Nowadays, they
34 seem to break in some fashion long before they have outlived their
35 usefulness.
36
37 I remounted the drives and did a backup.  For anyone running into this,
38 just in case one of the files got corrupted, I used a little trick to
39 see if I could figure out which one might be bad, if any.  I took the rsync
40 commands from my little script and ran them one at a time with --dry-run
41 added.  If a file that I hadn't changed or added was going to be updated
42 on the backup, I was going to look into it before updating my backups.  It
43 could be that the backup copy was still good and the copy on the drive
44 reporting problems was bad.  In that case, I would determine which was
45 good and either restore it from backups or allow it to be updated if
46 needed.  Either way, I should have a good file since the drive claims to
47 have fixed the problem.  Now let us pray.  :-D 
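
A minimal sketch of the --dry-run check described above, assuming Python 3 and rsync 3.x on the PATH. The function names and the /home/ and /mnt/backup/ paths are hypothetical and are not taken from the original script. The sketch adds -i (--itemize-changes) so that every file rsync would touch gets its own line, and -c (--checksum) because by default rsync only compares size and modification time, so a file whose contents changed silently might not show up otherwise.

```python
import subprocess


def dry_run_command(src, dst):
    """Build an rsync command that only reports what it would change."""
    # -a archive mode, -c compare contents by checksum,
    # -n dry run (nothing is written), -i itemize changes.
    return ["rsync", "-acni", src, dst]


def files_to_update(itemized_output):
    """Return the regular files that rsync says it would transfer."""
    paths = []
    for line in itemized_output.splitlines():
        # Each itemized line is a flag string such as ">f.st......",
        # one space, then the path (which may itself contain spaces).
        flags, sep, path = line.partition(" ")
        # First flag character: "<" or ">" means a transfer.
        # Second flag character: "f" means a regular file.
        if sep and len(flags) >= 2 and flags[0] in "<>" and flags[1] == "f":
            paths.append(path)
    return paths


def pending_updates(src, dst):
    """Run the dry run and list the files a real run would update."""
    result = subprocess.run(
        dry_run_command(src, dst),
        capture_output=True, text=True, check=True,
    )
    return files_to_update(result.stdout)


if __name__ == "__main__":
    # Hypothetical source and backup locations.
    for path in pending_updates("/home/", "/mnt/backup/"):
        print(path)
```

Any path in that list that was not knowingly changed or added is worth comparing against the backup copy before running the real sync.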
48
49 Drive isn't under warranty.  I may have to start buying new drives from
50 dealers.  Sometimes I find drives that are pulled from systems and have
51 very few hours on them.  Still, warranty may not last long.  Saves a lot
52 of money tho. 
53
54 USPS claims drive is on the way.  Left a distribution point and should
55 update again when it gets close.  First said Saturday, then said
56 Friday.  I think Friday is about right but if the wind blows right,
57 maybe Thursday. 
58
59 I hope I have another port and power cable plug for the swap out.  At
60 least now, I can unmount it and swap without a lot of rebooting.  Since
61 it's on LVM, that part is easy.  Unfortunately, I have experience with
62 that process.  :/
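
For reference, a rough sketch of the usual LVM sequence for moving data off one physical volume and onto a replacement. This is an assumption about the general procedure, not necessarily the exact steps used here; the volume group name vg_data and the device names /dev/sdX1 and /dev/sdY1 are placeholders. The function only builds and prints the commands so they can be reviewed before anything is run as root.

```python
def lvm_replace_commands(vg, old_pv, new_pv):
    """Return the LVM commands for migrating a volume group off old_pv."""
    return [
        ["pvcreate", new_pv],        # initialise the new partition as a physical volume
        ["vgextend", vg, new_pv],    # add the new physical volume to the volume group
        ["pvmove", old_pv, new_pv],  # move all allocated extents off the old one
        ["vgreduce", vg, old_pv],    # remove the now-empty old volume from the group
        ["pvremove", old_pv],        # wipe the LVM label from the old drive
    ]


if __name__ == "__main__":
    # Placeholder names; substitute the real volume group and devices.
    for cmd in lvm_replace_commands("vg_data", "/dev/sdX1", "/dev/sdY1"):
        print(" ".join(cmd))
```

This needs both drives connected at the same time, which is why a spare port and power plug matter.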
63
64 Thanks to all. 
65
66 Dale
67
68 :-)  :-) 
