On Tuesday, 8 November 2022 03:31:07 GMT Grant Edwards wrote:
> I've got an SSD that's failing, and I'd like to know what files
> contain bad blocks so that I don't attempt to copy them to the
> replacement disk.
>
> According to e2fsck(8):
>
> -c This option causes e2fsck to use badblocks(8) program to do
> a read-only scan of the device in order to find any bad blocks. If any
> bad blocks are found, they are added to the bad block inode to prevent
> them from being allocated to a file or directory. If this option is
> specified twice, then the bad block scan will be done using a
> non-destructive read-write test.
>
> What happens when the bad block is _already_allocated_ to a file?
>
> --
> Grant
|
Whether or not the bad block was previously allocated to a file and has since
been re-allocated, my understanding is that with spinning disks the data in a
bad block stays there unless you've dd'ed some zeros over it. Even then, read
or write operations could fail if the block is too far gone.[1] Some data
recovery applications will try to read data off a bad block in different
patterns to retrieve what's there. Once a block has been marked as bad, the
filesystem won't write new data to it again.
|
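As I understand the howto at [1], if you already know the LBA of a failing
sector (e.g. from the SMART self-test log), you can convert it to a filesystem
block number and then ask debugfs which inode and path own that block, using
its icheck and ncheck commands. Below is a minimal sketch of just the
arithmetic. The function name and the example numbers are mine and purely
illustrative, and it assumes 512-byte logical sectors unless told otherwise.

```python
def lba_to_fs_block(lba, partition_start, fs_block_size, sector_size=512):
    """Convert a disk LBA into a filesystem block number.

    lba             -- LBA of the failing sector, counted from the start of the disk
    partition_start -- first sector of the partition (e.g. as shown by fdisk -lu)
    fs_block_size   -- filesystem block size in bytes (e.g. from tune2fs -l)
    sector_size     -- logical sector size in bytes (assumed 512 here)
    """
    if lba < partition_start:
        raise ValueError("LBA lies before the start of the partition")
    # Byte offset into the partition, rounded down to a whole fs block.
    return (lba - partition_start) * sector_size // fs_block_size


if __name__ == "__main__":
    # Illustrative values only: sector 23421417 on a partition starting at
    # sector 63, with 4096-byte filesystem blocks.
    print(lba_to_fs_block(23421417, 63, 4096))
```

With those example values the result is block 2927669, which is the number
you would hand to debugfs icheck. On an SSD the caveat further down still
applies: the LBA is a virtual address, so this only tells you which file is
affected, not where the data physically lives.
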
With SSDs the situation is less deterministic, because the disk's internal
wear levelling firmware moves things around according to its own algorithms
and remaps bad blocks. This is all transparent to the filesystem; the block
addresses the fs works with are virtual anyway. Bypassing the firmware
controller to access individual cells on an SSD requires specialist equipment
and your own lab, although things may have evolved since I last looked into
this.
|
The general advice is to avoid powering down an SSD which is suspected of
corruption until all the data has been copied/recovered off it. If you power
it down, data on it may never be accessible again without the aforementioned
lab.
|
BTW, running badblocks in read-write mode on an ailing/aged SSD may exacerbate
the problem without much benefit, by accelerating wear and causing additional
cells to fail. At the same time you would be relying on the suspect disk
firmware to access, via its virtual map, the data on some of its cells. Data
scrubbing (btrfs, zfs) and recent backups would probably be a better strategy
with SSDs.
|
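If the aim is only to find out which files can no longer be read, a gentler
alternative to a read-write badblocks run is to read every file once and note
the ones that fail. The sketch below is only that, a sketch: the function
name is made up, it assumes the filesystem is mounted (ideally read-only), and
it assumes that a failing block shows up as an I/O error on read, which may
not hold if the data is served from the page cache or the firmware silently
returns bad data.

```python
import os


def find_unreadable_files(root, chunk_size=1 << 20):
    """Read every regular file under root and return those that fail.

    Returns a list of (path, errno) tuples. An errno of 5 (EIO) is the one
    that usually points at a media problem; other values (e.g. permission
    errors) are reported too so that nothing is skipped silently.
    """
    failed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Only regular files; skip symlinks, sockets, device nodes etc.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as f:
                    # Read and discard the contents in chunks; nothing is
                    # written to the suspect disk.
                    while f.read(chunk_size):
                        pass
            except OSError as exc:
                failed.append((path, exc.errno))
    return failed


if __name__ == "__main__":
    import sys

    for path, err in find_unreadable_files(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(f"{err}\t{path}")
```

The resulting list can then be used as an exclude list when copying to the
replacement disk. It still reads every allocated block once, so it is not
free in terms of stress on the drive, but it avoids the extra writes.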
|
[1] https://www.smartmontools.org/wiki/BadBlockHowto |