Gentoo Archives: gentoo-user

From: Dale <rdalek1967@×××××.com>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] smartctrl drive error @60%
Date: Wed, 25 Jun 2014 13:16:08
Message-Id: 53AACB8D.6010300@gmail.com
In Reply to: Re: [gentoo-user] smartctrl drive error @60% by thegeezer
1 thegeezer wrote:
2 > On 06/25/2014 08:49 AM, Dale wrote:
3 >> thegeezer wrote:
4 >>> this is pretty bad.
5 >> Here is the output:
6 >>
7 >> root@fireball / # smartctl -a /dev/sdc
8 >> smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
9 >> Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
10 >>
11 >> === START OF INFORMATION SECTION ===
12 >> Model Family: Seagate Barracuda 7200.14 (AF)
13 >> Device Model: ST3000DM001-9YN166
14 >> Serial Number: Z1F0PKT5
15 >> LU WWN Device Id: 5 000c50 04d79e15c
16 >> Firmware Version: CC4C
17 >> User Capacity: 3,000,592,982,016 bytes [3.00 TB]
18 >> Sector Sizes: 512 bytes logical, 4096 bytes physical
19 >> Rotation Rate: 7200 rpm
20 >> Device is: In smartctl database [for details use: -P show]
21 >> ATA Version is: ATA8-ACS T13/1699-D revision 4
22 >> SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
23 >> Local Time is: Wed Jun 25 02:46:39 2014 CDT
24 >>
25 >> ==> WARNING: A firmware update for this drive is available,
26 >> see the following Seagate web pages:
27 >> http://knowledge.seagate.com/articles/en_US/FAQ/207931en
28 >> http://knowledge.seagate.com/articles/en_US/FAQ/223651en
29 > interesting - not seen that before might be worth a nose
30
31 I was thinking the same thing myself. I also wondered how it knows there
32 is an update available.
33
34 >> SMART support is: Available - device has SMART capability.
35 >> SMART support is: Enabled
36 >>
37 >> === START OF READ SMART DATA SECTION ===
38 >> SMART overall-health self-assessment test result: PASSED
39 >>
40 >> General SMART Values:
41 >> Offline data collection status: (0x00) Offline data collection activity
42 >> was never started.
43 >> Auto Offline Data Collection:
44 >> Disabled.
45 >> Self-test execution status: ( 118) The previous self-test completed
46 >> having
47 >> the read element of the test failed.
48 >> Total time to complete Offline
49 >> data collection: ( 584) seconds.
50 >> Offline data collection
51 >> capabilities: (0x73) SMART execute Offline immediate.
52 >> Auto Offline data collection
53 >> on/off support.
54 >> Suspend Offline collection upon new
55 >> command.
56 >> No Offline surface scan supported.
57 >> Self-test supported.
58 >> Conveyance Self-test supported.
59 >> Selective Self-test supported.
60 >> SMART capabilities: (0x0003) Saves SMART data before entering
61 >> power-saving mode.
62 >> Supports SMART auto save timer.
63 >> Error logging capability: (0x01) Error logging supported.
64 >> General Purpose Logging supported.
65 >> Short self-test routine
66 >> recommended polling time: ( 1) minutes.
67 >> Extended self-test routine
68 >> recommended polling time: ( 340) minutes.
69 >> Conveyance self-test routine
70 >> recommended polling time: ( 2) minutes.
71 >> SCT capabilities: (0x3085) SCT Status supported.
72 >>
73 >> SMART Attributes Data Structure revision number: 10
74 >> Vendor Specific SMART Attributes with Thresholds:
75 >> ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE
76 >> UPDATED WHEN_FAILED RAW_VALUE
77 >> 1 Raw_Read_Error_Rate 0x000f 119 099 006 Pre-fail
78 >> Always - 234421760
79 > you can happily ignore this error rate, it is usual for it to be high
80 > and there is hardware correction for it
81 >
82 >> 3 Spin_Up_Time 0x0003 092 092 000 Pre-fail
83 >> Always - 0
84 >> 4 Start_Stop_Count 0x0032 100 100 020 Old_age
85 >> Always - 33
86 > 33 power cycles seems very low, but further down we see the power-on time
87 > is just under two years, which is also on the lighter side of
88 > the MTBF
89
90 About the only time I shut down is when the power fails. My puter only
91 pulls about 150 watts so I just leave it running 24/7.
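
As a rough check on the "just under two years" figure, here is a minimal
Python sketch that converts the Power_On_Hours raw value reported further
down in the output (16379) into days and years. The 365.25-day year is an
assumption made for the conversion, and the variable names are only
illustrative.

```python
# Convert SMART Power_On_Hours into days and years.
# Raw values are copied from the smartctl output in this message.
POWER_ON_HOURS = 16379     # attribute 9, Power_On_Hours
POWER_CYCLES = 34          # attribute 12, Power_Cycle_Count

HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365.25     # assumption: average year length

power_on_days = POWER_ON_HOURS / HOURS_PER_DAY
power_on_years = power_on_days / DAYS_PER_YEAR
hours_per_cycle = POWER_ON_HOURS / POWER_CYCLES

print(f"{power_on_days:.0f} days, {power_on_years:.2f} years")
print(f"about {hours_per_cycle:.0f} hours of uptime per power cycle")
```

That works out to roughly 1.87 years of spinning, which fits a machine
that is only powered off when the mains fail.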
92
93
94 >
95 >> 5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail
96 >> Always - 0
97 > zero reallocated sectors suggests there is space to do reallocation
98 >
99 >> 7 Seek_Error_Rate 0x000f 079 060 030 Pre-fail
100 >> Always - 99909120
101 >> 9 Power_On_Hours 0x0032 082 082 000 Old_age
102 >> Always - 16379
103 > almost two years of power on time
104 >
105 >> 10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail
106 >> Always - 0
107 >> 12 Power_Cycle_Count 0x0032 100 100 020 Old_age
108 >> Always - 34
109 >> 183 Runtime_Bad_Block 0x0032 100 100 000 Old_age
110 >> Always - 0
111 >> 184 End-to-End_Error 0x0032 100 100 099 Old_age
112 >> Always - 0
113 >> 187 Reported_Uncorrect 0x0032 100 100 000 Old_age
114 >> Always - 0
115 >> 188 Command_Timeout 0x0032 100 100 000 Old_age
116 >> Always - 0 0 0
117 >> 189 High_Fly_Writes 0x003a 100 100 000 Old_age
118 >> Always - 0
119 >> 190 Airflow_Temperature_Cel 0x0022 069 063 045 Old_age
120 >> Always - 31 (Min/Max 26/33)
121 >> 191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age
122 >> Always - 0
123 >> 192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age
124 >> Always - 9
125 >> 193 Load_Cycle_Count 0x0032 093 093 000 Old_age
126 >> Always - 14284
127 >> 194 Temperature_Celsius 0x0022 031 040 000 Old_age
128 >> Always - 31 (0 17 0 0 0)
129 >> 197 Current_Pending_Sector 0x0012 100 100 000 Old_age
130 >> Always - 104
131 > 197
132 > this says there are 104 pending sectors i.e. bad blocks on the drive
133 > that have not been reallocated yet
134
135 Wonder why it hasn't? Isn't it supposed to do that sort of thing itself?
136
137
138 >
139 >> 198 Offline_Uncorrectable 0x0010 100 100 000 Old_age
140 >> Offline - 104
141 > this says it was not able to reallocate, which is odd because
142 > entry 5 is zero
143
144 Uh oh.
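
To keep an eye on just these three counters, a small Python sketch along
these lines might do. It assumes the attribute rows are fed in unwrapped,
one row per line, the way `smartctl -A` prints them before a mail client
wraps them. The sample rows below are retyped from the output above, and
the function name and the choice of watched IDs are illustrative
assumptions, not anything smartmontools provides.

```python
# Pull the RAW_VALUE of a few sector-health attributes out of
# smartctl attribute rows and report the ones that are non-zero.

# Sample rows retyped from this message (assumed unwrapped).
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       104
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       104
"""

# Attribute IDs discussed in this thread (illustrative selection).
WATCHED = {5, 197, 198}


def raw_values(text):
    """Return {attribute_id: raw_value} for the watched attributes."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        # Columns: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        if attr_id in WATCHED:
            found[attr_id] = int(fields[9])
    return found


values = raw_values(SAMPLE)
suspect = {attr_id: raw for attr_id, raw in values.items() if raw > 0}
print("non-zero sector counters:", suspect)
```

With the rows above, attribute 5 stays at zero while 197 and 198 both
show 104, which is the mismatch being discussed.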
145
146 >
147 >> 199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age
148 >> Always - 0
149 >> 240 Head_Flying_Hours 0x0000 100 253 000 Old_age
150 >> Offline - 15955h+37m+28.932s
151 >> 241 Total_LBAs_Written 0x0000 100 253 000 Old_age
152 >> Offline - 52221690631887
153 >> 242 Total_LBAs_Read 0x0000 100 253 000 Old_age
154 >> Offline - 74848968465606
155 >>
156 >> SMART Error Log Version: 1
157 >> No Errors Logged
158 >>
159 >> SMART Self-test log structure revision number 1
160 >> Num Test_Description Status Remaining
161 >> LifeTime(hours) LBA_of_first_error
162 >> # 1 Extended offline Completed: read failure 60%
163 >> 16365 2905482560
164 >> # 2 Extended offline Completed: read failure 60%
165 >> 16352 2905482560
166 >> # 3 Extended offline Completed without error 00%
167 >> 8044 -
168 >> # 4 Extended offline Completed without error 00%
169 >> 3121 -
170 >> # 5 Extended offline Completed without error 00%
171 >> 1548 -
172 >> # 6 Short offline Completed without error 00%
173 >> 1141 -
174 >> # 7 Extended offline Completed without error 00%
175 >> 719 -
176 >> # 8 Extended offline Completed without error 00%
177 >> 525 -
178 >> # 9 Short offline Completed without error 00%
179 >> 516 -
180 >> #10 Extended offline Completed without error 00%
181 >> 18 -
182 >> #11 Extended offline Completed without error 00%
183 >> 5 -
184 >> #12 Short offline Completed without error 00%
185 >> 0 -
186 >>
187 >> SMART Selective self-test log data structure revision number 1
188 >> SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
189 >> 1 0 0 Not_testing
190 >> 2 0 0 Not_testing
191 >> 3 0 0 Not_testing
192 >> 4 0 0 Not_testing
193 >> 5 0 0 Not_testing
194 >> Selective self-test flags (0x0):
195 >> After scanning selected spans, do NOT read-scan remainder of disk.
196 >> If Selective self-test is pending on power-up, resume after 0 minute delay.
197 >>
198 >> root@fireball / #
199 >>
200 >> Does that help shed any light on this situation? If you need more
201 >> info, just let me know. Off to newegg. BRB
202 >>
203 >> Dale
204 >>
205 >> :-) :-)
206 >>
207 > 104 bad blocks is not a sign of a happy disk.
208 > i would replace urgently
209 >
210 > also consider running smartd or a smartmonitor plugin for munin as the
211 > test log suggests you last ran a test after the first year of usage
212 >
213 >
214
215 I usually just run the test manually but I sort of had family stuff
216 going on for the past year, almost a year anyway. Sort of behind on
217 things although I have been doing my normal updates.
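
For what it's worth, the gap is visible in the LifeTime(hours) column of
the self-test log. A quick Python sketch of the arithmetic, with the hours
copied from the log above; these are power-on hours rather than calendar
time, which only matches closely because the box runs 24/7:

```python
# LifeTime(hours) column of the self-test log above, newest test first.
TEST_HOURS = [16365, 16352, 8044, 3121, 1548, 1141, 719, 525, 516, 18, 5, 0]

# Power-on hours elapsed between each test and the one before it.
gaps = [newer - older for newer, older in zip(TEST_HOURS, TEST_HOURS[1:])]

longest_gap = max(gaps)
print(f"longest gap: {longest_gap} hours, about {longest_gap / 24:.0f} days")
```

The longest stretch without a test is the one between entries #3 and #2,
a bit under a year of running time.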
218
219 I ordered a drive. It should be here tomorrow. In the meantime, I
220 shut down and re-seated all the cables, power too. I got the test running
221 again but results are a few hours off yet. It did pass the short test
222 though. I'm not sure that it means much.
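
While the long test runs, the LBA_of_first_error from the log can be
turned into a position on the disk. This is only a sketch in Python using
the numbers from the output above. It assumes the logged LBA counts
512-byte logical sectors, as given on the "Sector Sizes" line, and the
variable names are made up for the example.

```python
# Locate the first failing LBA from the self-test log on the disk.
LOGICAL_SECTOR_BYTES = 512               # "512 bytes logical"
PHYSICAL_SECTOR_BYTES = 4096             # "4096 bytes physical"
USER_CAPACITY_BYTES = 3_000_592_982_016  # "User Capacity"
FIRST_ERROR_LBA = 2_905_482_560          # LBA_of_first_error, tests #1 and #2

# Assumption: the logged LBA is in logical (512-byte) sectors.
total_sectors = USER_CAPACITY_BYTES // LOGICAL_SECTOR_BYTES
byte_offset = FIRST_ERROR_LBA * LOGICAL_SECTOR_BYTES
fraction = FIRST_ERROR_LBA / total_sectors
physical_sector = byte_offset // PHYSICAL_SECTOR_BYTES

print(f"byte offset:     {byte_offset}")
print(f"physical sector: {physical_sector}")
print(f"position:        {fraction:.1%} of the way through the disk")
```

By LBA that puts the first unreadable sector a little under halfway
through the drive. Both failed runs stopped at the same LBA.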
223
224 Thanks much.
225
226 Dale
227
228 :-) :-)

Replies

Subject Author
Re: [gentoo-user] smartctrl drive error @60% Rich Freeman <rich0@g.o>
Re: [gentoo-user] smartctrl drive error @60% David Haller <gentoo@×××××××.de>
Re: [gentoo-user] OT: power requirement (WAS: smartctrl drive error @60%) Frank Steinmetzger <Warp_7@×××.de>