Mark Knecht writes:

> Do I just watch the logs looking for problems? I have no way of
> knowing right now whether this was a disk problem that's going to come
> back, a 1 time deal due to power, or something else entirely.
>
> As these cheap machines that don't use RAID what's the right way to
> go? emerge -e @world and then wait for the next event? Do nothing and
> wait?

Emerge smartmontools, then:

smartctl -H /dev/sda # overall health: what the drive thinks about itself (-a prints the full report)

smartctl -t short /dev/sda # start short self-test
Wait (smartctl prints the estimated duration)
smartctl -l selftest /dev/sda # see results

smartctl -t long /dev/sda # start long self-test
Wait a lot longer
smartctl -l selftest /dev/sda # see results

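The start/wait/check sequence above could be wrapped in a small shell
function. This is only a sketch: run_selftest is a made-up name, and it
assumes an ATA drive whose "smartctl -c" output contains the phrase
"Self-test routine in progress" while a test is running. It needs root.

```shell
#!/bin/sh
# Sketch: start a SMART self-test, wait until the drive reports it done,
# then print the self-test log.
# Assumption: "smartctl -c" shows "Self-test routine in progress" while a
# test is running (typical for ATA drives).

# run_selftest DEVICE [short|long] [POLL_SECONDS]
run_selftest() {
    dev=$1
    kind=${2:-short}
    poll=${3:-60}

    # Start the test; the drive runs it on its own in the background.
    smartctl -t "$kind" "$dev" >/dev/null
    rc=$?
    # Exit status bits 0-2 mean: bad command line, device open failed,
    # or a SMART command failed. Give up in any of those cases.
    [ $((rc & 7)) -eq 0 ] || return 1

    # Poll until the drive no longer reports a running self-test.
    while smartctl -c "$dev" | grep -q 'Self-test routine in progress'; do
        sleep "$poll"
    done

    # Show the self-test log; the newest result is entry # 1.
    smartctl -l selftest "$dev"
}
```

For example, "run_selftest /dev/sda long 300" would start the long test
and check every five minutes whether it has finished.
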
You can continue working in the meantime; the test runs inside the
drive and there should be little to no performance impact. You will
see something like this in the log:

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      2275         -
# 2  Extended offline    Completed without error       00%      2270         -
# 3  Extended offline    Completed without error       00%      1799         -
# 4  Extended offline    Completed without error       00%       197         -
# 5  Extended offline    Completed without error       00%        26         -

If you have a '-' in the right column, the disk has found no errors. If
there is a number, then it's the position (LBA) of the first error.

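Checking that right column can be automated. The following is a
minimal sketch (selftest_errors is a made-up name): it reads the
self-test log on stdin and prints every entry whose last column is
something other than '-'. It only looks at that one column, so a test
that was aborted without an LBA would not be reported.

```shell
#!/bin/sh
# Sketch: read "smartctl -l selftest" output on stdin and print every
# log entry (lines starting with "#") whose last column is not "-".
# Returns 0 if no entry lists an LBA, 1 otherwise.
selftest_errors() {
    awk '
        /^#/ && $NF != "-" { print; bad = 1 }
        END { exit bad + 0 }
    '
}
```

Usage would be something like:
smartctl -l selftest /dev/sda | selftest_errors && echo "no errors logged"
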
There's also badblocks, which checks every block and outputs the bad
ones: badblocks -sv /dev/sda

badblocks -svn /dev/sda will do a non-destructive read-write test (the
disk must not be mounted for this). In case of a bad block, the drive
should exchange it with a spare one. Maybe this happens already in
read-only mode, I am not sure.

Also watch for errors in syslog or via dmesg; there should be some
whenever a bad block is accessed.

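To save some scrolling, the kernel log can be filtered for the disk in
question. This is a rough sketch (disk_log_errors is a made-up name),
and the error phrases in the pattern are my assumption of what
typically shows up, not a complete list. Messages logged under the ATA
port name (ata1 and so on) rather than the device name will not match.

```shell
#!/bin/sh
# Sketch: read kernel log text on stdin and print lines that mention the
# given device name together with a typical error phrase.
# Assumption: the phrases below are common ones, not an exhaustive list.
# disk_log_errors DEVICE_NAME    (e.g. sda, without /dev/)
disk_log_errors() {
    name=$1
    grep -i -- "$name" |
        grep -i -E 'I/O error|medium error|unrecovered read error'
}
```

Usage would be something like: dmesg | disk_log_errors sda
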
Wonko