Mauro Maroni <mmaroni@××××××.ar> posted
200611082025.25462.mmaroni@××××××.ar, excerpted below, on Wed, 08 Nov
2006 20:25:25 -0300:

> Well, then I got segfaults compiling other packages, and a couple of
> times the machine freezed doing trivial things like browsing the web.
> Could this be a hardware issue? RAM seems to be OK as I ran memtest
> during the night and did not show any error after 9 hours.

That's a classic hardware issue, yes. The cause can be one of several
things. Note that there are at least two ways RAM can be bad, and memtest
checks only one -- memory actually corrupting data in storage. From hard
experience, I know the other one all too well -- AND know that memtest
doesn't catch it AT ALL. That one is memory timing issues, and as
memory speeds increase, it's becoming more and more common. Taking my
case as an example, the RAM was rated PC3200, but simply wasn't stable at
that speed. Unfortunately, my mobo was new enough at the time, and used
the then-new AMD64 memory-controller-on-CPU technology, so the BIOS didn't
have the usual memory speed tweaking options. After I had fought with it
for some time, a BIOS upgrade eventually became available that added these
options, and a very simple (with the right BIOS option) tweak to reduce
memory clocking from the rated PC3200 (200 MHz, DDRed to 400, times the
8-byte bus width, equals 3200) to ~PC3000 (183 MHz, DDRed to 366, times 8,
is 2928, which rounds to 3000) eliminated the issue entirely. The system
was then rock-stable, even after tweaking some of the detailed individual
wait-state settings back up to increase the performance a bit from the
defaults.
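
For anyone who wants to check the PC-rating arithmetic, here is a minimal
Python sketch. The function name pc_rating and its parameters are made up
for illustration; the only facts it relies on are that DDR transfers data
twice per clock and that a standard DIMM has a 64-bit (8-byte) data bus.

```python
def pc_rating(clock_mhz, bus_bytes=8, transfers_per_clock=2):
    """Peak bandwidth in MB/s for a DDR module at the given base clock.

    clock_mhz           -- base memory clock in MHz (e.g. 200 for PC3200)
    bus_bytes           -- data bus width in bytes (64-bit DIMM = 8)
    transfers_per_clock -- 2 for DDR (data moves on both clock edges)
    """
    return clock_mhz * transfers_per_clock * bus_bytes


if __name__ == "__main__":
    for clock in (200, 183):
        print(f"{clock} MHz -> {pc_rating(clock)} MB/s")
    # 200 MHz -> 3200 MB/s
    # 183 MHz -> 2928 MB/s
```

The 183 MHz figure comes out at 2928 MB/s, which is why the declocked
setting is only approximately "PC3000".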

So, before you eliminate memory as a possibility, check your BIOS and try
declocking the memory a notch or two.
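
One crude way to compare stability before and after such a change is to
repeat an identical computation many times and see whether the answer
ever changes. The Python sketch below is only an illustration of that
idea, not a replacement for memtest or a real stress tester, and it is an
assumption on my part that it would trip over any particular timing
fault. The name count_mismatches and the default sizes are hypothetical.

```python
import hashlib
import os


def count_mismatches(size_bytes=16 * 1024 * 1024, rounds=10):
    """Hash copies of one random buffer; return how many digests differ.

    On healthy hardware every round must give the same digest, so any
    nonzero result points at the hardware rather than at the data.
    """
    data = os.urandom(size_bytes)
    reference = hashlib.sha256(data).digest()
    mismatches = 0
    for _ in range(rounds):
        # Copy the buffer so each round reads and writes RAM afresh.
        copy = bytes(bytearray(data))
        if hashlib.sha256(copy).digest() != reference:
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    bad = count_mismatches()
    print("mismatching rounds:", bad)
```

A result of zero proves nothing by itself (the original problem showed up
only intermittently), but a nonzero count is a strong hint that the
hardware, not the software, is at fault.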

Actually, all the hardware possibilities trace to the same root: what
should be a binary one at times becomes a binary zero, very often due to
undervolting. This can be due to speed issues, as with the above or if
you overclock your memory or CPU, or to power issues, which may occur
anywhere in your "power train", from the stuff coming to you from your
electricity supplier, to an underpowered computer power supply, to an
underpowered single voltage rail on that supply, to an underpowered UPS,
to a faulty power regulator on your mobo, to a bad connection somewhere,
to simply having too many things connected to the computer at once. Or it
can be both power and speed issues, since higher speeds commonly require
more power in order to remain stable. (This makes perfect sense given
that higher speeds mean there's less time to actually bring the transistor
to the high voltage "1" state before checking whether it is a 1 or a 0,
and boosting the supply voltage -- to a point -- can often make it reach
that state faster.)
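
The voltage-versus-speed argument can be illustrated with a toy model.
The sketch below treats a signal node as a simple RC circuit charging
toward the supply voltage and asks how long it takes to cross a fixed
"this is a 1" threshold. That is a deliberate oversimplification of how
real transistors behave, and every number in it (the 1.8 V threshold, the
1 ns time constant, the function name) is an assumption chosen purely for
illustration.

```python
import math


def time_to_threshold(v_supply, v_threshold, rc_seconds):
    """Time for an RC node charging toward v_supply to reach v_threshold.

    Uses V(t) = v_supply * (1 - exp(-t / RC)). If the supply is not
    above the threshold, the node never gets there.
    """
    if v_supply <= v_threshold:
        return math.inf
    return -rc_seconds * math.log(1.0 - v_threshold / v_supply)


if __name__ == "__main__":
    # Hypothetical values: 1.8 V threshold, 1 ns RC time constant.
    for volts in (2.5, 2.6, 2.7, 2.8):
        t = time_to_threshold(volts, 1.8, 1e-9)
        print(f"{volts:.1f} V supply: {t * 1e9:.2f} ns to reach a '1'")
```

In this model a higher supply voltage always shortens the time, and a
sagging supply lengthens it until the "1" is never reached in time. The
model says nothing about the extra heat and wear that put the "to a
point" limit on raising the voltage.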

So, it should go without saying, but cut the overclocking if you were
doing it (and note that overclocking can cause permanent damage that
persists even after returning to normal clocking). Next, check your power
supply, both at the wall plug and in the computer: make sure you are
using a good PSU, sufficiently highly rated and UL Listed (if in the US;
substitute the appropriate authority if elsewhere), since it's common
knowledge that the ratings of many power supplies lacking this listing
aren't worth the cost of the ink used to print them. If you are using a
UPS, check that too.

Finally, check for overheating.
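
On a Linux box, one way to look at temperatures is the kernel's sysfs
thermal interface, where each zone reports millidegrees Celsius in a
"temp" file. The Python sketch below assumes that interface is present;
which zones exist, and what they actually measure, depends on your kernel
and hardware, so treat the function name and the output as illustrative
only. The BIOS hardware monitor or lm_sensors may give better coverage.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone_name: degrees_celsius} for each readable zone."""
    readings = {}
    for temp_file in sorted(Path(base).glob("thermal_zone*/temp")):
        try:
            millidegrees = int(temp_file.read_text().strip())
        except (OSError, ValueError):
            continue  # zone unreadable or not reporting a number
        readings[temp_file.parent.name] = millidegrees / 1000.0
    return readings


if __name__ == "__main__":
    zones = read_thermal_zones()
    if not zones:
        print("no thermal zones found")
    for zone, celsius in zones.items():
        print(f"{zone}: {celsius:.1f} C")
```

Run it once at idle and again during a long compile; a reading that
climbs steadily until the machine freezes points at cooling rather than
at memory or power.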

Those are the most common hardware causes of instability.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

--
gentoo-amd64@g.o mailing list