Gentoo Archives: gentoo-user

From: Paul Hartman <paul.hartman+gentoo@×××××.com>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] hp H222 SAS controller
Date: Sun, 14 Jul 2013 22:36:22
Message-Id: CAEH5T2OTa6g8Q9W8bruCa3agAoQ=CrQOctDKjyLPt=UCccL_QA@mail.gmail.com
In Reply to: Re: [gentoo-user] hp H222 SAS controller by Alan McKinnon
1 On Mon, Jul 8, 2013 at 10:58 AM, Alan McKinnon <alan.mckinnon@×××××.com> wrote:
2 > On 08/07/2013 17:39, Paul Hartman wrote:
3 >> On Thu, Jul 4, 2013 at 9:04 PM, Paul Hartman
4 >> <paul.hartman+gentoo@×××××.com> wrote:
5 >>> ST4000DM000
6 >>
7 >> As a side-note, the two Seagate 4TB "Desktop" edition drives I bought
8 >> have, after about 100 hours of power-on usage, each
9 >> encountered dozens of unreadable sectors so far. I was able
10 >> to correct them (force reallocation) using hdparm... So it should be
11 >> "fixed", and I'm reading that this is "normal" with newer drives and
12 >> "don't worry about it", but I'm still coming from the time when 1 bad
13 >> sector = red alert, replace the drive ASAP. I guess I will need to
14 >> monitor and see if it gets worse.
15 >>
16 >
17 >
18 > Way back when in the bad old days of drives measured in 100s of megs,
19 > you'd get a few bad sectors now and then, and would have to mark them as
20 > faulty. This didn't bother us much then.
21 >
22 > Nowadays we have drives that are 8,000 times bigger than that so all other
23 > things being equal we'd expect sectors to fail 8,000 times more (more
24 > being a very fuzzy concept, and I know full well I'm using it loosely :-) )
25 >
26 > Our drives nowadays also have smart firmware, something we had to
27 > introduce when CHS no longer cut it; this led to sector failures being
28 > somewhat "invisible" leaving us with the happy delusion that drives were
29 > vastly reliable etc etc etc. But you know all this.
30 >
31 > A mere few dozen failures in the first 100 hours is a failure rate of
32 > (Alan whips out the trusty sci calculator) 4.8E-6%. Pretty damn
33 > spectacular if you ask me and WELL within probabilities.
34 >
35 > There is likely nothing wrong with your drives. If they are faulty, it's
36 > highly likely a systemic manufacturing fault of the mechanicals (servo
37 > systems, motor bearings etc.)
38 >
39 > You do realize that modern hard drives have for the longest time been up
40 > there in the Top X list of Most Reliable Devices Made By Mankind Ever?
41
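The quoted 4.8E-6% figure is not derived in the message, so here is a minimal sketch of one way to get a number of that order. It expresses the bad sectors as a share of all sectors on the drive, which is a proportion rather than a rate over time. The count of 36 (standing in for "a few dozen") and the 4096-byte sector size are assumptions for illustration, not values from the thread.

```python
# Rough share of sectors that have gone bad on a 4 TB drive.
# Assumptions (not from the thread): 36 bad sectors stands in for
# "a few dozen", and sectors are counted as 4096-byte physical sectors.
DRIVE_BYTES = 4 * 10**12   # 4 TB, decimal as marketed
SECTOR_BYTES = 4096        # assumed physical sector size
BAD_SECTORS = 36           # assumed value for "a few dozen"


def bad_sector_percent(bad, drive_bytes, sector_bytes):
    """Return bad sectors as a percentage of all sectors on the drive."""
    total_sectors = drive_bytes // sector_bytes
    return 100.0 * bad / total_sectors


pct = bad_sector_percent(BAD_SECTORS, DRIVE_BYTES, SECTOR_BYTES)
print(f"{pct:.1e}% of sectors affected")
```

Under these assumptions the result lands within the same order of magnitude as the quoted figure. Counting 512-byte logical sectors instead makes the percentage eight times smaller.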
42 An update: the Seagate drives have both continued to report more
43 unrecoverable errors and find more and more bad sectors, including
44 some end-to-end errors flagged with critical "FAILING NOW" status in
45 SMART. From what I have read, that error means the drive's internal
46 cache did not match the data written to disk, which seems like a
47 serious flaw. The threshold is 1 which means if it happens at all, the
48 drive should be replaced. It has happened half a dozen times on each
49 disk so far (but not at the exact same time, so I don't think it is a
50 host controller problem -- and other disks on the same controller and
51 cable have had no issues). They have also been disconnecting and
52 resetting randomly, sometimes requiring me to pull the drive and
53 reinsert it into the enclosure to make it reappear. It happens even
54 after I disabled APM, so I know it isn't a spin-down/idle timeout
55 thing. Temperatures are actually very good (low 30s °C) so they are not
56 overheating.
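For monitoring whether this gets worse, a minimal sketch of picking failing attributes out of a `smartctl -A` style attribute table follows. The sample table is hypothetical data shaped to resemble the symptoms described above, not output captured from these drives, and the names `SmartAttr`, `parse_attributes`, and `failing` are made up for illustration. An attribute is treated as failing when its WHEN_FAILED column reads FAILING_NOW, or when its normalized value has dropped to or below a non-zero threshold.

```python
from typing import NamedTuple

# Hypothetical sample in the layout of a `smartctl -A` attribute table.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       48
184 End-to-End_Error        0x0032   094   094   099    Old_age   Always   FAILING_NOW 6
194 Temperature_Celsius     0x0022   069   052   000    Old_age   Always       -       31
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""


class SmartAttr(NamedTuple):
    id: int
    name: str
    value: int
    thresh: int
    when_failed: str
    raw: str


def parse_attributes(text):
    """Parse attribute rows, skipping the header and any non-data lines."""
    attrs = []
    for line in text.splitlines():
        fields = line.split(None, 9)
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attrs.append(SmartAttr(int(fields[0]), fields[1], int(fields[3]),
                               int(fields[5]), fields[8], fields[9]))
    return attrs


def failing(attrs):
    """Return attributes flagged FAILING_NOW or at/below a non-zero threshold."""
    return [a for a in attrs
            if a.when_failed == "FAILING_NOW"
            or (a.thresh > 0 and a.value <= a.thresh)]


for attr in failing(parse_attributes(SAMPLE)):
    print(f"{attr.id} {attr.name}: value {attr.value} <= threshold "
          f"{attr.thresh}, raw {attr.raw}")
```

Feeding the function real output, for example captured from `smartctl -A` on a schedule, would make it easy to see whether the raw counts keep climbing.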
57
58 I think I will try to trade them in to Seagate for a new pair under
59 warranty replacement. And then probably try to sell the replacements
60 and be rid of them.
61
62 Meanwhile, during that experiment, I bought 2 brand new Western
63 Digital Red 3TB drives last week. No problems in SMART testing or
64 creating LVM/RAID/Filesystems. I have now been running the destructive
65 write/read badblocks tests for 24+ hours and they have been perfect so
66 far, exactly 0 errors. They are more expensive (3TB for the same price
67 as the 4TB Seagate) and have slightly slower read/write speed (150MB/sec
68 peak vs 170MB/sec peak), but I value reliability over all other
69 factors.
70
71 These Seagate drives must have some kind of manufacturing defect, or
72 perhaps were damaged in shipping... UPS have been known to treat
73 packages like a football!
