On Fri, Feb 26, 2010 at 1:46 AM, Alex Schuster <wonko@×××××××××.org> wrote:
> Mark Knecht writes:
>
>> Do I just watch the logs looking for problems? I have no way of
>> knowing right now whether this was a disk problem that's going to come
>> back, a one-time deal due to power, or something else entirely.
>>
>> As these are cheap machines that don't use RAID, what's the right way
>> to go? emerge -e @world and then wait for the next event? Do nothing
>> and wait?
>
> Emerge smartmontools, then:
>
> smartctl -H /dev/sda # get overview of what the drive thinks about itself
>
> smartctl -t short /dev/sda # start short self test
> Wait
> smartctl -l selftest /dev/sda # see results
>
> smartctl -t long /dev/sda # start long self test
> Wait a lot longer
> smartctl -l selftest /dev/sda # see results
>
> You can continue working in the meantime; there should be little
> performance impact. You will see something like this in the log:
>
> === START OF READ SMART DATA SECTION ===
> SMART Self-test log structure revision number 1
> Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
> # 1  Short offline       Completed without error       00%      2275         -
> # 2  Extended offline    Completed without error       00%      2270         -
> # 3  Extended offline    Completed without error       00%      1799         -
> # 4  Extended offline    Completed without error       00%       197         -
> # 5  Extended offline    Completed without error       00%        26         -
>
> If you have a '-' in the right column, the disk has found no errors. If
> there is a number, then it's the position of the first error.
>
> There's also badblocks, which will check every block and output the bad
> ones: badblocks -sv /dev/sda
>
> badblocks -svn /dev/sda will do a non-destructive read-write test. In
> case of a bad block, the drive should exchange it with a spare one. Maybe
> this happens already in read-only mode, I am not sure.
>
> Also watch for errors in syslog or via dmesg; there should be some when
> bad blocks are being accessed.
>
> Wonko
>
>

Hi Wonko,
Yes, I do use smartctl on some other machines, although I'm not very
good about it, and your write-up is helpful, so thanks for that.
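
In case it helps anyone following along, here is a rough sketch of how
the check from the write-up (a '-' in the last column means no error was
found) might be scripted. The function name selftest_problems is made up
for illustration and is not part of smartmontools; it only filters text
shaped like the self-test log quoted above, so treat it as a starting
point rather than a tested tool.

```shell
# Hypothetical helper (not part of smartmontools).
# Reads the output of "smartctl -l selftest DEVICE" on stdin and prints
# every log entry that is not clean, i.e. whose status is anything other
# than "Completed without error" or whose last column (LBA_of_first_error)
# is not "-". Entries that are still running or were aborted are printed too.
# Exit status: 0 if every entry is clean, 1 otherwise.
selftest_problems() {
    awk '
        # Log entries start with "#", the entry number, then a space.
        /^# *[0-9]+ / {
            if ($0 !~ /Completed without error/ || $NF != "-") {
                print
                bad = 1
            }
        }
        END { exit bad + 0 }
    '
}
```

Something like "smartctl -l selftest /dev/sda | selftest_problems" should
then print nothing and return 0 when every logged test is clean.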

My wife's machine is older, and I don't think SMART is
supported on her drive. Note the lack of a * on the SMART line in
hdparm -I:

dragonfly ~ # hdparm -I /dev/hda

/dev/hda:

ATA device, with non-removable media
    Model Number:       WDC WD1600BB-00FTA0
    Serial Number:      WD-WMAES2091586
    Firmware Revision:  15.05R15
Standards:
    Supported: 6 5 4
    Likely used: 6
Configuration:
    Logical         max     current
    cylinders       16383   16383
    heads           16      16
    sectors/track   63      63
    --
    CHS current addressable sectors:   16514064
    LBA    user addressable sectors:  268435455
    LBA48  user addressable sectors:  312581808
    Logical/Physical Sector size:           512 bytes
    device size with M = 1024*1024:      152627 MBytes
    device size with M = 1000*1000:      160041 MBytes (160 GB)
    cache/buffer size  = 2048 KBytes (type=DualPortCache)
Capabilities:
    LBA, IORDY(can be disabled)
    Standby timer values: spec'd by Standard, with device specific minimum
    R/W multiple sector transfer: Max = 16  Current = 16
    Recommended acoustic management value: 128, current value: 254
    DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5
         Cycle time: min=120ns recommended=120ns
    PIO: pio0 pio1 pio2 pio3 pio4
         Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
    Enabled Supported:
            SMART feature set
            Security Mode feature set
       *    Power Management feature set
       *    Write cache
       *    Look-ahead
       *    Host Protected Area feature set
       *    WRITE_BUFFER command
       *    READ_BUFFER command
       *    DOWNLOAD_MICROCODE
            SET_MAX security extension
            Automatic Acoustic Management feature set
       *    48-bit Address feature set
       *    Device Configuration Overlay feature set
       *    Mandatory FLUSH_CACHE
       *    FLUSH_CACHE_EXT
       *    SMART error logging
       *    SMART self-test
Security:
    supported
    not enabled
    not locked
    not frozen
    not expired: security count
    not supported: enhanced erase
HW reset results:
    CBLID- above Vih
    Device num = 0 determined by CSEL
Checksum: correct
dragonfly ~ #

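A small sketch for pulling that flag out of the hdparm -I output without
eyeballing it. It assumes the '*' belongs to the "Enabled" column under
the "Enabled Supported:" header shown above, so a feature that is listed
without a '*' is reported as supported but not currently enabled; that
reading is an assumption on my part. The function name
smart_feature_state is hypothetical.

```shell
# Hypothetical helper: reads "hdparm -I DEVICE" output on stdin and reports
# how the "SMART feature set" line is flagged.
# Assumption: a leading "*" marks the feature as enabled, and a line with
# no "*" means the feature is listed (supported) but not enabled.
smart_feature_state() {
    awk '
        /SMART feature set/ {
            found = 1
            if ($1 == "*") enabled = 1
        }
        END {
            if (!found)       print "not listed"
            else if (enabled) print "enabled"
            else              print "supported but not enabled"
        }
    '
}
```

Usage would be along the lines of "hdparm -I /dev/hda | smart_feature_state",
run as root.
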
dragonfly ~ # smartctl -H /dev/hda
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

SMART Disabled. Use option -s with argument 'on' to enable it.
dragonfly ~ # smartctl -s on /dev/hda
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
Error SMART Enable failed: Input/output error
Smartctl: SMART Enable Failed.

A mandatory SMART command failed: exiting. To continue, add one or
more '-T permissive' options.
dragonfly ~ #

I've not tried the -T permissive options.

I've never used badblocks, as it seems I should only do that off-line.
This might be a good time to boot from a CD and try it out.
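
For the off-line run, a guard along these lines might be worth having
before starting the read-write test. It is only a sketch: the function
name device_in_use is hypothetical, it assumes a Linux system where
/proc/mounts lists mounted filesystems, and it does not catch swap
partitions or a root filesystem that shows up as /dev/root.

```shell
# Hypothetical helper: succeeds (exit 0) if DEVICE, or a partition whose
# name starts with DEVICE, appears in the first column of the mount table.
# The table path defaults to /proc/mounts; the optional second argument
# exists only so the function can be tried against a sample file.
# Limitation: swap and entries listed as /dev/root are not detected.
device_in_use() {
    dev=$1
    table=${2:-/proc/mounts}
    awk -v dev="$dev" '
        index($1, dev) == 1 { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$table"
}

# Usage sketch, left commented out on purpose (run as root from a rescue CD):
# device_in_use /dev/hda || badblocks -svn /dev/hda
```

The prefix match is deliberately cautious: it would rather refuse to run
on a similarly named device than start a read-write test on a disk that
is mounted.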

Maybe I should just get a new drive that supports SMART?

- Mark