Gentoo Archives: gentoo-user

From: Mark Knecht <markknecht@×××××.com>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] My PC died. What should I try?
Date: Fri, 17 Aug 2012 17:57:27
Message-Id: CAK2H+ec0=ChDqDqFJW1QXi8ESvfG7EZt6DRqW6R3ixHBgvkiyw@mail.gmail.com
In Reply to: [gentoo-user] My PC died. What should I try? by Alex Schuster
1 On Fri, Aug 17, 2012 at 12:50 AM, Alex Schuster <wonko@×××××××××.org> wrote:
2 > Hi there!
3 >
4 > Two days ago, my PC suddenly died, after working fine for half a year. I
5 > used myrtcwake as usual to suspend to RAM, and it woke up in the morning.
6 > But after two minutes, the screen went blank and nothing, even SysRq, gave a
7 > reaction. I tried booting a couple of times again, and sometimes it did not
8 > even reach KDM. Now, I cannot even run Grub (from my USB stick) any more, I
9 > only see a "GRUB" string at the top right, then nothing happens.
10 >
11 > Booting with SystemRescueCD also freezes sometimes. If not, I can make it
12 > freeze after seconds by running 'memtester'.
13 >
14 > Booting good old memtest86 ran for an hour and only found one error, then I
15 > aborted, removed three of my four memory modules (4GB each), and tried
16 > different ones in the first bank. Memtest86 again did not find many errors,
17 > but froze once. Running memtester after booting from SystemRescueCD again
18 > makes the thing freeze in seconds. It once also froze while in the
19 > BIOS setup.
20 >
21 > What could be the problem? CPU, board, or even the PSU? I do not think it
22 > has to do with bad memory. I removed most of the other stuff (hard drives,
23 > PCI cards). I have no similar hardware so I cannot simply exchange things,
24 > the question is what to buy and try. How would you proceed?
25 >
26 > The fan is still working, the cooler does not become hot, and in the BIOS
27 > there are no high temperatures being reported. But one thing was strange: I
28 > updated Calligra from 2.4 to 2.5 (I think), and it took ages, at least 8
29 > hours. I thought there may be something strange with the build process of
30 > this new version, forcing MAKEOPTS=-j1 and such, but still this is very
31 > long. But when working with it, I did not notice anything strange like
32 > sluggish reactions, and videos played fine. But I did not use it as much as
33 > I normally do, and maybe even when overheated and throttled down it would
34 > have been fast enough for me to not notice this. I watch the syslog
35 > normally, but maybe I just did not look closely that day, I was busy doing
36 > other stuff.
37 >
38 > CPUs don't just die, do they? Even when overheating, I think these days
39 > they throttle down, so no permanent harm should be done? So maybe it's the board?
40 > It looks okay, no bent or leaking capacitors.
41 >
42 > This is really annoying. Of course most of my passwords are in my KDE wallet
43 > I cannot access. There's also Wiki, CVS and Git repositories, not needed
44 > every day, but still important. And the timing is very bad, I just started
45 > my new job the day the problem happened, and I do not have much time for
46 > this now. Before, I was working at home, so I would have had all day to
47 > diagnose and try things.
48 >
49 > It's an AMD FX-4100 Quad-Core CPU, and an ASRock 880GMH/U3S3 board.
50 >
51 > Wonko
52 >
53
54 Hi Alex,
55 Sorry for the problems.
56
57 I've read most of the responses so it seems you're getting good
58 info. A few things:
59
60 1) You asked "CPUs don't just die, do they?". The answer is 'yes, they
61 do.' It can happen at any time:
62
63 http://en.wikipedia.org/wiki/Bathtub_curve
64
65 2) If I understand your post, along with the other discussions, it
66 seems that you can remove all cards and all memory except 1 DIMM and
67 boot the machine to BIOS. Is that correct? If so then your CPU isn't
68 completely dead.
69
70 3) As you are seeing some memory problems it might be that memory
71 died. (See the bathtub curve again - it applies to everything.) However it
72 seems very unlikely that all memory died at the same time. More likely
73 is the chipset. If you change DIMMs but keep plugging them into the
74 same memory channel then it might be that channel in the chipset
75 that's having trouble. If it's your chipset, you're sunk. Get a new
76 MB.
77
78 As others have suggested, the PSU is another common potential culprit.
79 With everything else out of the box, memory swapped but the same
80 problem occurring, and the ability to at least get into BIOS, it's
81 likely either the PSU or the MB.
82
83 Good luck,
84 Mark