On Tuesday, 8 November 2022 17:55:51 GMT Laurence Perkins wrote:
> >-----Original Message-----
> >From: Grant Edwards <grant.b.edwards@×××××.com>
> >Sent: Tuesday, November 8, 2022 6:28 AM
> >To: gentoo-user@l.g.o
> >Subject: [gentoo-user] Re: e2fsck -c when bad blocks are in existing file?
> >
> >On 2022-11-08, Michael <confabulate@××××××××.com> wrote:
> >> On Tuesday, 8 November 2022 03:31:07 GMT Grant Edwards wrote:
> >>> I've got an SSD that's failing, and I'd like to know what files
> >>> contain bad blocks so that I don't attempt to copy them to the
> >>> replacement disk.
> >>>
> >>> According to e2fsck(8):
> >>>
> >>>   -c  This option causes e2fsck to use badblocks(8) program to do
> >>>       a read-only scan of the device in order to find any bad
> >>>       blocks. If any bad blocks are found, they are added to the
> >>>       bad block inode to prevent them from being allocated to a
> >>>       file or directory. If this option is specified twice, then
> >>>       the bad block scan will be done using a non-destructive
> >>>       read-write test.
> >>>
> >>> What happens when the bad block is _already_allocated_ to a file?
> >>
> >> Previously allocated to a file and now re-allocated or not, my
> >> understanding is with spinning disks the data in a bad block stays
> >> there unless you've dd'ed some zeros over it. Even then read or write
> >> operations could fail if the block is too far gone.[1] Some data
> >> recovery applications will try to read data off a bad block in
> >> different patterns to retrieve what's there. Once the bad block is
> >> categorized as such it won't be used by the filesystem to write new data
> >> to it again.
> >
> >Thanks. I guess I should have been more specific in my question.
> >
> >What does e2fsck -c do to the filesystem structure when it discovers a bad
> >block that is already allocated to an existing inode?
> >
> >Is the inode's chain of block groups left as is -- still containing the bad
> >block that (according to the man page) "has been added to the bad block
> >inode"? Presumably not, since a block can't be allocated to two different
> >inodes.
> >
> >Is the "broken" file split into two chunks (before/after the bad
> >block) and moved to the lost-and-found?
> >
> >Is the man page's description only correct when the bad block is currently
> >unallocated?
> >
> >--
> >Grant
>
> If I recall correctly, it will add any unreadable blocks to its internal
> list of bad sectors, which it will then refuse to allocate in the future.
>
> I don't believe it will attempt to move the file to elsewhere until it is
> written since: A) what would you then put in that block? You don't know
> the contents. B) Moving the file around would make attempts to recover the
> data from that bad sector significantly more difficult.

As far as I know, trying to write raw data directly to a bad block, e.g. with
dd or hdparm, will trigger the disk's controller firmware to reallocate the
data from the bad block to a spare. I always thought e2fsck won't write data
to a block unless it is empty. badblocks -w will write test patterns to blocks
and also trigger reallocation of any bad blocks. badblocks -n, which
corresponds to e2fsck -cc, is the non-destructive read-write mode: it writes
test patterns but restores the original contents of each block afterwards, and
it may or may not trigger a firmware reallocation.
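
To aim dd or hdparm at one particular block you first need to know where it
sits on the disk. A minimal sketch of that arithmetic, assuming 4096-byte
filesystem blocks and 512-byte logical sectors (the real values can be read
with tune2fs -l and blockdev --getss). The function name and the partition
start of 2048 are made up for illustration:

```python
def fs_block_to_lba(fs_block, block_size=4096, sector_size=512, part_start_lba=0):
    """Return (first_lba, sector_count) on the whole disk for one filesystem block.

    part_start_lba is the first sector of the partition holding the filesystem;
    leave it at 0 if the filesystem sits directly on the device.
    """
    if block_size % sector_size != 0:
        raise ValueError("block size must be a multiple of the sector size")
    sectors_per_block = block_size // sector_size
    first_lba = part_start_lba + fs_block * sectors_per_block
    return first_lba, sectors_per_block


if __name__ == "__main__":
    # Hypothetical example: filesystem block 10 on a partition starting at LBA 2048.
    first, count = fs_block_to_lba(10, part_start_lba=2048)
    print(f"fs block 10 -> LBA {first}..{first + count - 1}")
```

One filesystem block normally spans several sectors, so a single bad sector
makes the whole block suspect.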

I'm not sure what happens at the filesystem level when one bad block within an
extent is reallocated. The previously contiguous blocks of the extent will
no longer be contiguous. Does the hardware expose some SMART data to inform
the OS/fs of the reallocated block, to perform a whole extent remapping?

> This is, however, very unlikely to come up on a modern disk since most of
> them automatically remap failed sectors at the hardware level (also on
> write, for the same reasons). So the only time it would matter is if you
> have a disk that's more than about 20 years old, or one that's used up all
> its spare sectors...
>
> Unless, of course, you're resurrecting the old trick of marking a section of
> the disk as "bad" so the FS won't touch it, and then using it for raw data
> of some kind...
>
> You can, of course, test it yourself to be certain with a loopback file and
> a fake "badblocks" that just outputs your chosen list of bad sectors and
> then see if any of the data moves. I'd say like a 2MB filesystem and write
> a file full of 00DEADBEEF, then make a copy, blacklist some sectors, and
> hit it with your favorite binary diff command and see what moved. This is
> probably recommended since there could be differences between the behaviour
> of different versions of e2fsck.
>
> LMP
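
A rough sketch of the comparison step in that test, in case it is useful. It
only covers the parts that need no root access: writing a file full of the
00DEADBEEF marker and listing which blocks differ between two images. The
file names, function names and the 1024-byte block size are placeholders of
mine, and the demo at the bottom uses scratch files standing in for the
"before" and "after" copies of the loopback image.

```python
import os
import tempfile

PATTERN = bytes.fromhex("00DEADBEEF")  # 5-byte marker suggested above


def write_pattern_file(path, size):
    """Fill a file with the repeating marker, cut to exactly `size` bytes."""
    reps = size // len(PATTERN) + 1
    with open(path, "wb") as f:
        f.write((PATTERN * reps)[:size])


def differing_blocks(path_a, path_b, block_size=1024):
    """Return the block numbers at which two image files differ."""
    changed = []
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        block_no = 0
        while True:
            chunk_a = a.read(block_size)
            chunk_b = b.read(block_size)
            if not chunk_a and not chunk_b:
                break  # both files exhausted
            if chunk_a != chunk_b:
                changed.append(block_no)
            block_no += 1
    return changed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        before = os.path.join(tmp, "before.img")
        after = os.path.join(tmp, "after.img")
        write_pattern_file(before, 8 * 1024)
        write_pattern_file(after, 8 * 1024)
        with open(after, "r+b") as f:  # simulate one block being rewritten
            f.seek(3 * 1024)
            f.write(b"\xff" * 16)
        print(differing_blocks(before, after))
```

Pointing differing_blocks() at copies of the loopback image taken before and
after the e2fsck run should show which blocks it touched, and the marker makes
it easy to spot whether file data ended up somewhere new.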