On Mon, Dec 27, 2021 at 8:46 AM Wols Lists <antlists@××××××××××××.uk> wrote:
>
> On 27/12/2021 13:40, Michael wrote:
> > On Monday, 27 December 2021 11:32:39 GMT Wols Lists wrote:
> >> On 27/12/2021 11:07, Jacques Montier wrote:
> >>> Well, i don't know if my partitions are aligned or mis-aligned... How
> >>> could i get it ?
> >>
> >> fdisk would have spewed a bunch of warnings. So you're okay.
> >>
> >> I'm not sure of the details, but it's the classic "off by one" problem -
> >> if there's a mismatch between the kernel block size and the disk block
> >> size any writes required doing a read-update-write cycle which of course
> >> knackered performance. I had that hit a while back.
> >>
> >> But seeing as fdisk isn't moaning, that isn't the problem ...
> >>
> >> Cheers,
> >> Wol
> >
> > I also thought of misaligned boundaries when I first saw the error, but the
> > mention of Seagate by the OP pointed me to another edge case which crept up
> > with zstd compression on ZFS. I'm mentioning it here in case it is relevant:
> >
> > https://livelace.ru/posts/2021/Jul/19/unaligned-write-command/
> >
> that might be of interest to me ... I'm getting system lockups but it's
> not an SSD. I've got two IronWolves and a Barracuda.
>
> But I notice the OP has a Barra*C*uda. Note the different spelling.
> That's a shingled drive I believe, which shouldn't make a lot of
> difference in light usage, but you don't want to hammer it!
|
I've run into this issue and I've seen rare reports of it online, but
no sign of a resolution. I'm pretty sure it is some sort of bug in the
kernel. I've tended to see it under load, and mostly when using zfs.
I do not use zstd compression and do not have any zvols on the pools
that had this issue. So, either there are multiple problems, or that
linked post did not correctly identify the root cause (the latter
seems more likely to me). I'm guessing it is triggered under load, and
perhaps using zstd compression helps create that load.
|
I haven't seen it much lately, probably because I've shifted a lot of
my load to lizardfs and now use USB3 hard drives for the bulk of my
storage. Since these seem to be ATA errors, taking the SATA host and
its associated drivers out of the path may bypass the problem.
|
I doubt this has anything to do with physical/logical sector size and
partition alignment. The disks should still work correctly if the
partitions aren't aligned to physical sectors - they should just have
performance degradation. In any case, all my drives are aligned on
physical sector boundaries. I'm not familiar enough with ATA to
understand what the actual errors are referring to.
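
For anyone who wants to check alignment directly rather than rely on
fdisk warnings, here is a minimal sketch. It assumes the usual Linux
sysfs layout (the partition "start" file in 512-byte units, and
queue/physical_block_size on the parent disk). The helper names and
the example start sectors are made up for illustration.

```python
from pathlib import Path

SYSFS_SECTOR = 512  # sysfs reports a partition's "start" in 512-byte units


def is_aligned(start_sector, physical_block_size=4096):
    """True if the partition's byte offset is a multiple of the physical block size."""
    return (start_sector * SYSFS_SECTOR) % physical_block_size == 0


def check_partition(disk, part):
    """Read start sector and physical block size from sysfs (Linux only).

    `disk` and `part` are names such as "sdc" and "sdc1" (hypothetical here).
    """
    start = int(Path(f"/sys/block/{disk}/{part}/start").read_text())
    phys = int(Path(f"/sys/block/{disk}/queue/physical_block_size").read_text())
    return is_aligned(start, phys)


if __name__ == "__main__":
    print(is_aligned(2048))  # common 1 MiB-aligned start -> True
    print(is_aligned(63))    # old DOS-style start at sector 63 -> False
```

A misaligned partition would only show up here as False; as said
above, I would expect that to cost performance, not produce I/O errors.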
|
Here is an example of one of the errors I've had in the past from one
of these situations. A zpool scrub usually clears up any damage and
then the drive works normally until the issue happens again (which
hasn't happened in quite a while for me now). I have a dump of the
SMART logs and the kernel ring buffer:
|
ATA Error Count: 1
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
|
Error 1 occurred at disk power-on lifetime: 12838 hours (534 days + 22 hours)
When the command that caused the error occurred, the device was
active or idle.
|
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 51 e0 88 cc c3 06  Error: ICRC, ABRT at LBA = 0x06c3cc88 = 113495176
|
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC  Powered_Up_Time   Command/Feature_Name
-- -- -- -- -- -- -- --  ----------------  --------------------
61 00 c0 68 cb c3 40 08  2d+00:45:18.962   WRITE FPDMA QUEUED
60 00 b8 98 67 00 40 08  2d+00:45:18.917   READ FPDMA QUEUED
60 00 b0 98 65 00 40 08  2d+00:45:18.916   READ FPDMA QUEUED
60 00 a8 98 66 00 40 08  2d+00:45:18.916   READ FPDMA QUEUED
61 00 a0 68 ca c3 40 08  2d+00:45:18.879   WRITE FPDMA QUEUED
|
[354064.268896] ata6.00: exception Emask 0x11 SAct 0x1000000 SErr 0x480000 action 0x6 frozen
[354064.268907] ata6.00: irq_stat 0x48000008, interface fatal error
[354064.268910] ata6: SError: { 10B8B Handshk }
[354064.268915] ata6.00: failed command: WRITE FPDMA QUEUED
[354064.268919] ata6.00: cmd 61/00:c0:68:cb:c3/07:00:06:01:00/40 tag 24 ncq dma 917504 out
         res 50/00:00:68:cb:c3/00:07:06:01:00/40 Emask 0x10 (ATA bus error)
[354064.268922] ata6.00: status: { DRDY }
[354064.268926] ata6: hard resetting link
[354064.731093] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[354064.734739] ata6.00: configured for UDMA/133
[354064.734759] sd 5:0:0:0: [sdc] tag#24 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[354064.734764] sd 5:0:0:0: [sdc] tag#24 Sense Key : Illegal Request [current]
[354064.734767] sd 5:0:0:0: [sdc] tag#24 Add. Sense: Unaligned write command
[354064.734771] sd 5:0:0:0: [sdc] tag#24 CDB: Write(16) 8a 00 00 00 00 01 06 c3 cb 68 00 00 07 00 00 00
[354064.734774] print_req_error: I/O error, dev sdc, sector 4408462184
[354064.734791] ata6: EH complete
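
To make the numbers in the log above easier to read, here is a small
sketch that decodes the Write(16) CDB and the SError value. The CDB
layout used (opcode 0x8a, 8-byte big-endian LBA at offset 2, 4-byte
transfer length at offset 10) is the standard SCSI one. The SError bit
names are the short names libata prints, as far as I know them, and
the 512-byte logical sector size is an assumption, since the log does
not state it.

```python
# Values copied from the kernel log above.
CDB_HEX = "8a 00 00 00 00 01 06 c3 cb 68 00 00 07 00 00 00"
SERROR = 0x480000
LOGICAL_SECTOR = 512  # assumption: not stated in the log

# SError diagnostic bits 16..26 with the short names libata prints.
SERROR_DIAG_BITS = {
    16: "PHYRdyChg", 17: "PHYInt", 18: "CommWake", 19: "10B8B",
    20: "Dispar", 21: "BadCRC", 22: "Handshk", 23: "LinkSeq",
    24: "TrStaTrns", 25: "UnrecFIS", 26: "DevExch",
}


def decode_write16(cdb_hex):
    """Return (lba, sector_count) from a WRITE(16) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x8A:
        raise ValueError("not a WRITE(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")
    nsect = int.from_bytes(cdb[10:14], "big")
    return lba, nsect


def decode_serror(value):
    """Return the names of the diagnostic bits set in an SError value."""
    return [name for bit, name in sorted(SERROR_DIAG_BITS.items())
            if value & (1 << bit)]


if __name__ == "__main__":
    lba, nsect = decode_write16(CDB_HEX)
    print("LBA:", lba)
    print("sectors:", nsect, "bytes:", nsect * LOGICAL_SECTOR)
    print("starts on 4 KiB boundary:", (lba * LOGICAL_SECTOR) % 4096 == 0)
    print("SError:", decode_serror(SERROR))
```

This gives LBA 4408462184, the same sector as in the print_req_error
line, and 1792 sectors, which at 512 bytes each is the 917504 bytes
shown in the cmd line. The LBA is a multiple of 8, so under that
sector-size assumption the write started on a 4 KiB boundary. The
SError bits decode to 10B8B and Handshk, both link-level errors.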
|
-- |
Rich |