Re: [gentoo-user] [OT] Badblocks on my harddisk - gentoo-user

From:	meino.cramer@×××.de
To:	gentoo-user@l.g.o
Subject:	Re: [gentoo-user] [OT] Badblocks on my harddisk
Date:	Sat, 26 Jul 2014 08:13:36
Message-Id:	`20140726081325.GA3835@solfire`
In Reply to:	Re: [gentoo-user] [OT] Badblocks on my harddisk by Dale

Dale <rdalek1967@×××××.com> [14-07-26 09:54]:
> meino.cramer@×××.de wrote:
> > Hi,
> >
> > After running smartctl for an extended offline test I got
> > a bad block (information extracted from the report):
> >
> > SMART Self-test log structure revision number 1
> > Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
> > # 1  Extended offline    Completed: read failure       90%     14460         4288352511
> > 197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
> >
> > I found an explanation to map the LBA to a partition here:
> > http://smartmontools.sourceforge.net/badblockhowto.html
> >
> > My partition layout is:
> > #> sudo fdisk -lu /dev/sda
> >
> > Disk /dev/sda: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
> > Units: sectors of 1 * 512 = 512 bytes
> > Sector size (logical/physical): 512 bytes / 512 bytes
> > I/O size (minimum/optimal): 512 bytes / 512 bytes
> > Disklabel type: dos
> > Disk identifier: 0x07ec16a2
> >
> > Device     Boot      Start        End    Blocks  Id System
> > /dev/sda1  *          2048     104447     51200  83 Linux
> > /dev/sda2           104448   12687359   6291456  82 Linux swap / Solaris
> > /dev/sda3         12687360  222402559 104857600  83 Linux
> > /dev/sda4        222402560 1953525167 865561304   5 Extended
> > /dev/sda5        222404608  232890367   5242880  83 Linux
> > /dev/sda6        232892416  442607615 104857600  83 Linux
> > /dev/sda7        442609664  652324863 104857600  83 Linux
> > /dev/sda8        652326912  862042111 104857600  83 Linux
> > /dev/sda9        862044160 1071759359 104857600  83 Linux
> > /dev/sda10      1071761408 1281476607 104857600  83 Linux
> > /dev/sda11      1281478656 1491193855 104857600  83 Linux
> > /dev/sda12      1491195904 1953525167 231164632  83 Linux
> >                 4288352511  <<< The number reported by smartctl
> >
> >
> > Following the linked document...
> > It seems the bad LBA is not on the checked harddisk.
> >
> > Or (more obvious) I did something wrong...
> >
> > How can I correctly identify the partition which contains the bad
> > block?
> > How can I get a full list of all bad blocks (if any) from a mounted
> > file system?
> > How severe is the problem?
> >
> > Thank you very much for any help in advance!
> > Best regards,
> > mcc
> >
>
> I ran into this recently on the drive that has my home partition on it.
> Someone posted that it *may* be fixable without moving data etc etc.  I
> didn't have a backup at the time and nothing large enough to make one so
> I just ordered a new drive.  When I got the new drive in and moved my
> data over, then I played with the drive a bit.  I used dd to erase the
> drive, then stuck a file system back on it and filled it up.  After
> doing that, the drive seems to have marked that part as bad and doesn't
> use it anymore.  It has passed every test since then.
>
> My point is this, backups for sure just in case but you may be able to
> get the drive to mark that area as bad by moving that data off there.
> In my case, the files were corrupted and gone.  Yea, I might could have
> sent it somewhere but I ain't into that.  Too much money for files I can
> replace if needed.  I think it was like 3 or 4 video files.  I'd find
> out what files are there, see what damage has occurred so that you can
> correct later, then find one really good howto and follow it.   From my
> understanding, if you can move that data in the bad spot off there, the
> drive sort of fixes itself.  If yours works like mine did, you should be
> OK but I'd use it for stuff that ain't so important.  I use mine as a
> backup drive and test it a lot.  ;-)  I may trust it again, one day.
>
> So, most likely you will have some files corrupted at least.  The drive
> *may* be fixable if you can figure out what files to move so that the
> drive can do its magic.  Key thing is, finding out what to move so that
> the drive can do its work.  Two options, try to move files so the drive
> can do its thing or move all the data to another drive, do like I did
> mine with dd and give it a fresh start that way.   I didn't feel I had
> the experience to try and move the files so I took the 2nd option.  Now
> I wish I had done option #1 and took notes that I could pass on.  That
> would likely help you more.
>
> BTW, my drive gave that error for weeks and never got worse.  I could be
> lucky on that one so do what needs doing as soon as you can, just in
> case.  The last drive that really failed on me years ago, I got a
> serious warning from SMART.  It even said I had like 24 hours to get my
> data off.  It needs attention in your case but hopefully you will have
> the results I did in the end and you have time to deal with it.
>
> Dale
>
> :-)  :-)
>

Hi Dale,

thank you very much for the explanations you gave...and for the hope
in them ;) :)

In the meantime I found ddrescue... :)

It took me five hours to copy the disk (1 TB) bit for bit to another,
identical one with ddrescue. This beast is smart: it first copies
everything it can read right away and writes out a logfile, which
records what is wrong with the disk and where:

# Rescue Logfile. Created by GNU ddrescue version 1.16
# Command line: ddrescue -f -n /dev/sda /dev/sdb ddrescue.log
# current_pos  current_status
0x36220000     +
#      pos        size  status
0x00000000  0x3621F000  +
0x3621F000  0x00000E00  /
0x3621FE00  0x00000200  -
0x36220000  0xE8AAB96000  +

In my case it reports one erroneous read and a defective area of 4096
bytes.
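
The 4096-byte figure can be reproduced from the logfile itself. The following is only a minimal sketch: it assumes the block list has three whitespace-separated fields per line (position, size, status, all sizes in bytes) and treats every status other than "+" as not yet rescued. The helper names `parse_blocks` and `bytes_by_status` are made up for illustration and are not part of ddrescue.

```python
# Logfile text copied from the first ddrescue run above.
FIRST_PASS_LOG = """\
# Rescue Logfile. Created by GNU ddrescue version 1.16
# Command line: ddrescue -f -n /dev/sda /dev/sdb ddrescue.log
# current_pos  current_status
0x36220000     +
#      pos        size  status
0x00000000  0x3621F000  +
0x3621F000  0x00000E00  /
0x3621FE00  0x00000200  -
0x36220000  0xE8AAB96000  +
"""


def parse_blocks(log_text):
    """Return (pos, size, status) tuples from the block list of a logfile."""
    blocks = []
    for line in log_text.splitlines():
        fields = line.split()
        # Skip blank lines, comments, and the two-field current_pos line.
        if len(fields) != 3 or fields[0].startswith("#"):
            continue
        blocks.append((int(fields[0], 16), int(fields[1], 16), fields[2]))
    return blocks


def bytes_by_status(blocks):
    """Sum the block sizes separately for each status character."""
    totals = {}
    for _pos, size, status in blocks:
        totals[status] = totals.get(status, 0) + size
    return totals


blocks = parse_blocks(FIRST_PASS_LOG)
totals = bytes_by_status(blocks)
# Everything that is not marked "+" still needs work (assumption, see above).
unrescued = sum(size for status, size in totals.items() if status != "+")
covered = sum(size for _pos, size, _status in blocks)

print("bytes per status:", totals)
print("not yet rescued:", unrescued, "bytes")
print("bytes covered by the log:", covered)
```

The "/" and "-" entries add up to 0xE00 + 0x200 = 0x1000 bytes, and the sizes of all entries together match the disk size that fdisk reports.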

After that, ddrescue is called a second time with different parameters
and the same logfile.

It then tries to read the failed area again, retrying several times.
After that the logfile looks like this:

# Rescue Logfile. Created by GNU ddrescue version 1.16
# Command line: ddrescue -d -f -r3 /dev/sda /dev/sdb ddrescue.log
# current_pos  current_status
0x3621FE00     +
#      pos        size  status
0x00000000  0x3621F000  +
0x3621F000  0x00001000  -
0x36220000  0xE8AAB96000  +

What has been fixed is gone from the logfile.

So there is something left: the 4096-byte region at 0x3621F000 is
still marked bad ("-").
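
The remaining bad region can be related to the partition table from the quoted fdisk output. This is a rough sketch under two assumptions: the positions in the ddrescue logfile are byte offsets from the start of /dev/sda, and the fdisk start/end values are 512-byte sectors, as its "Units" line says. The function name `partitions_for_byte_range` is invented for this example.

```python
SECTOR_SIZE = 512  # logical sector size from the fdisk output

# (device, first sector, last sector) copied from the fdisk listing.
# The extended container /dev/sda4 is left out; its logical partitions are listed.
PARTITIONS = [
    ("/dev/sda1", 2048, 104447),
    ("/dev/sda2", 104448, 12687359),
    ("/dev/sda3", 12687360, 222402559),
    ("/dev/sda5", 222404608, 232890367),
    ("/dev/sda6", 232892416, 442607615),
    ("/dev/sda7", 442609664, 652324863),
    ("/dev/sda8", 652326912, 862042111),
    ("/dev/sda9", 862044160, 1071759359),
    ("/dev/sda10", 1071761408, 1281476607),
    ("/dev/sda11", 1281478656, 1491193855),
    ("/dev/sda12", 1491195904, 1953525167),
]


def partitions_for_byte_range(pos, size):
    """Map a byte range on the disk to its sector range and overlapping partitions."""
    first = pos // SECTOR_SIZE
    last = (pos + size - 1) // SECTOR_SIZE
    hits = [name for name, start, end in PARTITIONS
            if start <= last and first <= end]
    return first, last, hits


# Bad region from the second logfile: pos 0x3621F000, size 0x00001000.
first, last, hits = partitions_for_byte_range(0x3621F000, 0x1000)
print("sectors", first, "to", last, "->", hits)
```

Under these assumptions the region covers sectors 1773816 to 1773823, which fall inside /dev/sda2, the partition fdisk lists as swap. That would have to be confirmed on the real system before drawing conclusions.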

I will start a complete smartctl scan again and see whether the bad
block has been remapped.
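
Once a new self-test log is available, the reported LBA_of_first_error can first be checked for plausibility. The sketch below only compares a value against the sector count from fdisk and against the sector range of the region ddrescue could not read; `classify_lba` is a made-up helper, and the 512-byte sector size is taken from the fdisk output.

```python
TOTAL_SECTORS = 1953525168  # sector count from the fdisk header
REPORTED_LBA = 4288352511   # LBA_of_first_error from the earlier self-test log

# Region ddrescue could not read (pos 0x3621F000, size 0x1000), in 512-byte sectors.
DDRESCUE_BAD = range(0x3621F000 // 512, (0x3621F000 + 0x1000) // 512)


def classify_lba(lba):
    """Say where an LBA lies relative to the disk and the ddrescue bad region."""
    if not 0 <= lba < TOTAL_SECTORS:
        return "outside the disk"
    if lba in DDRESCUE_BAD:
        return "inside the region ddrescue could not read"
    return "on the disk, but not in the ddrescue bad region"


print(REPORTED_LBA, "is", classify_lba(REPORTED_LBA))
print(DDRESCUE_BAD.start, "is", classify_lba(DDRESCUE_BAD.start))
```

The value from the first report is larger than the number of sectors on the disk, which fits the impression in the quoted message that this LBA is not on the checked disk. Why smartctl printed that value is not something this check can explain.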

I *hope* that this is a one-off, because only one spot (and a
small one) on a 1 TB disk is affected...we will see (fingers crossed).

By the way: there are other tools similar to ddrescue, such as
dd_rescue. I found ddrescue recommended over the others
on the net.

Now I will start a complete smartctl check...this will take hours...
I will report later what happened...

Best regards and have a nice weekend!
mcc
