On 3/1/21 3:25 PM, John Blinka wrote:
> HI, Gentooers!

Hi,

> So, I typed dd if=/dev/zero of=/dev/sd<wrong letter>, and despite
> hitting ctrl-c quite quickly, zeroed out some portion of the initial
> part of a disk. Which did this to my zfs raidz3 array:

OOPS!!!

>   NAME                                          STATE     READ WRITE CKSUM
>   zfs                                           DEGRADED     0     0     0
>     raidz3-0                                    DEGRADED     0     0     0
>       ata-HGST_HUS724030ALE640_PK1234P8JJJVKP   ONLINE       0     0     0
>       ata-HGST_HUS724030ALE640_PK1234P8JJP3AP   ONLINE       0     0     0
>       ata-ST4000NM0033-9ZM170_Z1Z80P4C          ONLINE       0     0     0
>       ata-ST4000NM0033-9ZM170_Z1ZAZ8F1          ONLINE       0     0     0
>       14296253848142792483                      UNAVAIL      0     0     0  was /dev/disk/by-id/ata-ST4000NM0033-9ZM170_Z1ZAZDJ0-part1
>       ata-ST4000NM0033-9ZM170_Z1Z80KG0          ONLINE       0     0     0

Okay. So the pool is online and the data is accessible. That's
actually better than I originally thought. -- I thought you had
accidentally damaged part of the ZFS partition that existed on a single
disk. -- I've been able to repair this with minimal data loss (zeros)
with Oracle's help on Solaris in the past.

Aside: My understanding is that ZFS stores multiple copies of its
metadata on the disk (assuming a single disk) and that it is possible to
recover a pool if any one copy (or maybe two, for consistency checks) is
viable. Though doing so is further into the weeds than you normally
want to be.

> Could have been worse. I do have backups, and it is raid3, so all I've
> injured is my pride, but I do want to fix things. I'd appreciate
> some guidance before I attempt doing this - I have no experience at
> it myself.

First, your pool / its raidz3 is only 'DEGRADED', which means that the
data is still accessible. 'OFFLINE' would be more problematic.
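
If you want a one-line confirmation of that state, something like this
should do it (a sketch; 'zfs' is your pool name, per the status output
above):

# zpool list -H -o name,health zfs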
|
> The steps I envision are
>
> 1) zpool offline zfs 14296253848142792483 (What's that number?)

I'm guessing it's the GUID that ZFS assigned to that disk (vdev)
internally. You will probably need to reference it.
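
If you want to see where that number comes from, I believe current
OpenZFS can print the vdev GUIDs in place of the device names (a sketch):

# zpool status -g zfs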

I see no reason to take the pool offline.

> 2) do something to repair the damaged disk

I don't think you need to do anything at the individual disk level yet.

> 3) zpool online zfs <repaired disk>

I think you can fix this with the pool online.
|
> Right now, the device name for the damaged disk is /dev/sda.
> Gdisk says this about it:
>
> Caution: invalid main GPT header,

This is to be expected.

> but valid backup; regenerating main header from backup!

This looks promising.
|
> Warning: Invalid CRC on main header data; loaded backup partition table.
> Warning! Main and backup partition tables differ! Use the 'c' and 'e' options
> on the recovery & transformation menu to examine the two tables.

I'm assuming that the main partition table is at the start of the disk
and that it's what got wiped out.

So I'd think that you can look at the 'c' and 'e' options on the
recovery & transformation menu for options to repair the main partition
table.
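
In gdisk, I'd expect the repair to look roughly like this (a sketch from
memory; read each prompt and check the gdisk docs before writing anything
to the disk):

# gdisk /dev/sda
r   (open the recovery & transformation menu)
b   (rebuild the main GPT header from the backup header)
c   (load the backup partition table over the damaged main one)
v   (verify the disk)
w   (write the repaired tables to disk and exit)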
|
> Warning! Main partition table CRC mismatch! Loaded backup partition table
> instead of main partition table!

I know. Thank you for using the backup partition table.

> Warning! One or more CRCs don't match. You should repair the disk!

I'm guessing that this is a direct result of the dd oops. I would want
more evidence to support it being a larger problem.

The CRC may be calculated over a partially zeroed chunk of disk. ("Chunk"
because I don't know what term is best here, and I want to avoid implying
anything specific or incorrect.)
|
> Main header: ERROR
> Backup header: OK
> Main partition table: ERROR
> Backup partition table: OK

ACK
|
> Partition table scan:
> MBR: not present
> BSD: not present
> APM: not present
> GPT: damaged
>
> Found invalid MBR and corrupt GPT. What do you want to do? (Using the
> GPT MAY permit recovery of GPT data.)
> 1 - Use current GPT
> 2 - Create blank GPT
>
> Your answer: ( I haven't given one yet)

I'd assume #1, Use current GPT.
|
> I'm not exactly sure what this is telling me. But I'm guessing it
> means that the main partition table is gone, but there's a good
> backup.

That's my interpretation too.

It jibes with the description of what happened.
|
> In addition, some, but not all disk id info is gone:
> 1) /dev/disk/by-id still shows ata-ST4000NM0033-9ZM170_Z1ZAZDJ0
> (the damaged disk) but none of its former partitions

The disk ID still being there may be a symptom / side effect of when
udev creates the links. I would expect it to not be there post-reboot.

Well, maybe. The disk serial number is independent of any data on the disk.

Partitions by ID would probably be gone post reboot (or eject and
re-insertion).
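
(Once the partition table is repaired, something like this may coax udev
into recreating those links without a reboot. A sketch; partprobe comes
from parted:)

# partprobe /dev/sda
# udevadm settle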
|
> 2) /dev/disk/by-partlabel shows entries for the undamaged disks in
> the pool, but not the damaged one

Okay. That means that udev is recognizing the change faster than I
would have expected.

That probably means that the ID in #1 has survived any such update.

> 3) /dev/disk/by-partuuid similar to /dev/disk/by-partlabel

Given #2, I'm not surprised at #3.
|
> 4) /dev/disk/by-uuid does not show the damaged disk

Hum.

> This particular disk is from a batch of 4 I bought with the same make
> and specification and very similar ids (/dev/disk/by-id). Can I
> repair this disk by copying something off one of those other disks
> onto this one?

Maybe. But I would not bother. (See below.)
|
> Is repair just repartitioning - as in the Gentoo handbook? Is it
> as simple as running gdisk and typing 1 to accept gdisk's attempt at
> recovering the gpt? Is running gdisk's recovery and transformation
> facilities the way to go (the b option looks like it's made for
> exactly this situation)?

gdisk will address the partition problem. But that doesn't do anything
for ZFS.

> Anybody experienced at this and willing to guide me?

I've not dealt with this particular problem. But I have dealt with a
few different things.
|
My course of action would be:

0) Copy the entire disk to another disk if possible and if you are
sufficiently paranoid. (See the sketch just after this list.)
1) Let gdisk repair the main partition table using the data from the
backup partition table.
2) Leverage ZFS's RAIDZ functionality to recover the ZFS data.
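
(For step 0, a raw dd copy is one option. A sketch, assuming a spare disk
at /dev/sdX that is at least as large as the damaged one; given how this
all started, triple-check both device names first:)

# dd if=/dev/sda of=/dev/sdX bs=1M conv=noerror,sync status=progress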

I /think/ that #2 can be done with one command. Do your homework to
understand, check, and validate this. You are responsible for your own
actions, despite what some random on the Internet says. ;-)

# zpool replace zfs 14296253848142792483 sda
|
Assuming that /dev/sda is the corrupted disk.

This will cause ZFS to remove the 14296253848142792483 disk from the
pool and rebuild onto the (/dev/)sda disk. -- ZFS doesn't care that
they are the same disk.
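
(If you would rather have the pool record a stable name for the new disk,
like the other members use, you could point the replace at the by-id path
instead of sda. A sketch, using the id shown earlier:)

# zpool replace zfs 14296253848142792483 /dev/disk/by-id/ata-ST4000NM0033-9ZM170_Z1ZAZDJ0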
|
You can keep track of the resilver with something like the following:

# while true; do zpool status zfs; sleep 60; done
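
(watch(1) does the same thing with less typing, if it's installed:)

# watch -n 60 zpool status zfs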
|
Since your pool is only 'DEGRADED', you are probably in an okay
position. It's just a matter of not making things worse while trying to
make them better.

Given that you have a RAIDZ3 and all of the other disks are ONLINE, your
data should currently be safe.
|

--
Grant. . . .
unix || die