Gentoo Archives: gentoo-user

From: Rich Freeman <rich0@g.o>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Re: Emerge --sync source
Date: Thu, 07 Mar 2019 14:45:52
Message-Id: CAGfcS_=gUNLap5vDDQ8zkzh1vfFU1WE=TKAy=6=HF3UgYnTyzA@mail.gmail.com
In Reply to: [gentoo-user] Re: Emerge --sync source by Grant Edwards
1 On Thu, Mar 7, 2019 at 9:29 AM Grant Edwards <grant.b.edwards@×××××.com> wrote:
2 >
3 > On 2019-03-07, Mick <michaelkintzios@×××××.com> wrote:
4 >
5 > > I can think of 4 things, but more learned M/L contributors may add to these:
6 > >
7 > > 1. The SATA connection has come loose. With time and movement it can come
8 > > (slightly) adrift. Pushing it back in fully fixes this problem - also see No.
9 > > 2 below.
10 > >
11 > > 2. The physical connector's contacts are beginning to oxidise. Reseat the
12 > > SATA cable connectors both on the drive and any ribbons on the MoBo. This
13 > > usually cleans any oxidisation.
14 > >
15 > > 3. The AHCI driver is deploying energy saving measures (aka. Aggressive Link
16 > > Power Management - ALPM). Check the output of:
17 > >
18 > > cat /sys/class/scsi_host/host*/link_power_management_policy
19 > >
20 > > If it doesn't say 'max_performance' you'll need to revisit your BIOS settings
21 > > and also PCIEASPM settings in the kernel.
22 > >
23 > > 4. Finally, there is a chance the PSU is playing up.
24 >
25 > Perhaps it's already been mentioned, but failing RAM can cause all
26 > sorts of failures that might appear to be failing disks, failing network
27 > cards, failing video cards, whatever. I'd run memtest86 for at least
28 > 12 hours just to make sure...
29 >
30
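The link power management check quoted above can be scripted if you
want to look at every host at once. A minimal sketch in Python, assuming
the sysfs path from the quoted command; the function names and the
overridable base directory are only for illustration:

```python
from pathlib import Path


def read_alpm_policies(base="/sys/class/scsi_host"):
    """Map each hostN directory name to its link power management policy."""
    policies = {}
    for policy_file in sorted(Path(base).glob("host*/link_power_management_policy")):
        policies[policy_file.parent.name] = policy_file.read_text().strip()
    return policies


def hosts_not_max_performance(policies):
    """Return the hosts whose policy is anything other than max_performance."""
    return [host for host, policy in policies.items() if policy != "max_performance"]


if __name__ == "__main__":
    found = read_alpm_policies()
    for host, policy in found.items():
        print(f"{host}: {policy}")
    flagged = hosts_not_max_performance(found)
    if flagged:
        print("Not at max_performance:", ", ".join(flagged))
```

Any host it flags is a candidate for the BIOS and kernel settings
mentioned in the quoted text.
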
31 Failing RAM or failing power certainly can cause all manner of
32 filesystem and other corruption. I've seen it firsthand and cleaning
33 up from it is a total mess (usually best to restore from backup). I
34 would definitely start with a memory test - if the motherboard is good
35 then you can work outwards from there.
36
37 From what I've heard SSDs can have bizarre failure modes since they
38 interpose a logical layer between the physical storage media and the
39 rest of the system. They're doing wear-leveling and so on behind the
40 scenes, which means that if something goes wrong all kinds of strange
41 problems can occur.
42
43 I've also seen a spinning hard drive exhibit lots of data
44 corruption issues due to a faulty SATA interface (not sure where in
45 the interface it was: chipset, port, or cable). ZFS saved me there with
46 detection and resolution of errors, and when I moved the drive to a
47 different HBA it worked fine after a scrub. I'd never seen anything
48 like it before but it really made me appreciate ZFS (btrfs should have
49 also worked) - I don't think mdadm would have had any way to resolve
50 these errors easily, though maybe if I used a hex editor to figure out
51 which drive was the bad one I might have been able to move it, wipe
52 it, then re-add it to the mirror pair and let it rebuild. With ZFS I
53 just got an email complaining about errors from zed and it just kept
54 beating back the hordes until I fixed the connection. I forget if it
55 dropped the drive or not - I didn't have any spares but if I did I
56 suspect it would have swapped it in after enough problems.
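
To illustrate why a checksumming filesystem can sort this out where a
plain mirror can't, here is a toy model in Python. It is not how ZFS is
actually implemented; SHA-256 and the function names are just
assumptions for the sketch. The point is only that a checksum stored
separately from the data tells you which copy is the good one:

```python
import hashlib


def checksum(data: bytes) -> str:
    """Checksum kept apart from the data blocks (SHA-256 is an arbitrary choice)."""
    return hashlib.sha256(data).hexdigest()


def plain_mirror_read(copies):
    """Without a checksum, a mirror can only notice that the copies differ."""
    if len(set(copies)) > 1:
        raise ValueError("copies differ, cannot tell which one is correct")
    return copies[0]


def checksummed_mirror_read(copies, expected):
    """Return the copy matching the checksum and overwrite the bad ones with it."""
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        raise ValueError("no copy matches the stored checksum")
    repaired = [i for i, c in enumerate(copies) if c != good]
    for i in repaired:
        copies[i] = good
    return good, repaired


if __name__ == "__main__":
    original = b"important data"
    stored = checksum(original)
    # Copy 0 came back corrupted, as over a flaky SATA link.
    copies = [b"important dat\x00", original]
    data, repaired = checksummed_mirror_read(copies, stored)
    print(data, "repaired copies:", repaired)
```

With only two differing copies and no checksum, plain_mirror_read has
nothing to go on, which is roughly the position I'd have been in with
mdadm.
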
57
58 --
59 Rich
