"Peter Davoust" <worldgnat@×××××.com> posted
7c08b4dd0705132304h5eccea49k22513343959aff52@××××××××××.com, excerpted
below, on Mon, 14 May 2007 02:04:30 -0400:

> I agree, it could be the heat, and that was the first thing that came to
> my mind, but Vista boots and runs for long periods of time with no
> issues. I'll check it out with the new kernel in the morning and see
> what it does.

Note that Gentoo tends to push hardware to its limits rather more than
most OSs, MSWormOS and other Linux distributions alike. Vista is so new,
and /does/ stress at least the video hardware rather more (if Aero is on,
anyway), that I don't know if anyone can rightly say yet how it compares,
but certainly with older MS platforms it hasn't been uncommon at /all/
for Gentoo to hit problems where MS didn't, and where even other Linux
distributions didn't.

Part of the reason is that Gentoo tends to be compiled/optimized for the
specific CPU it's running on, so it makes more efficient use of it,
including functionality that distributions (and MS) compiled for generic
hardware simply don't use. Add the simple fact that when the CPU is
busy, it's often getting more done in the same time, so it IS working
harder and therefore stressing the hardware more.

Anyway, just because another OS doesn't have problems on a computer
doesn't mean Gentoo won't, and there are quite a number of folks on the
forums and on the gentoo-user list who will tell you the same thing --
learned from hard experience.

Meanwhile, you mention specifically that one of the crashes was during a
bz2 decompress. As someone who has HAD memory issues in the past, I can
DEFINITELY tell you that bz2 DOES often trigger memory errors, if
ANYTHING will! If the crashes during bz2 turn out to be common, CHECK
THAT MEMORY, and check it again! You mentioned you have 2 gigs.
Hopefully it's in the form of 2 or more sticks. If so, you should be
able to take part of it out and see if the problem persists, then swap
and test the other memory. If the problem happens with one set but not
the other, you've found your problem. Do note, however, that just
because the problem continues to occur with either set doesn't
necessarily mean it's not the memory, particularly if the sticks are the
same brand and size, purchased from the same place at the same time, and
so likely from the same lot.

In my case, I had purchased generic memory that couldn't quite do its
rated PC3200 (clock at 200 MHz x 2, since it was DDR). I ran memtest and
it passed with flying colors, because the memory cells themselves worked
fine, and memtest apparently doesn't really stress the memory timings,
only the cells. However, I was crashing in operation: sometimes just the
app, sometimes the entire kernel would panic. I turned on the kernel's
MCE (machine check exception) reporting, and the memory was indeed the
problem (google MCEs; there's an app available that you can run, feeding
it the numbers, and it'll spit out the error in English). Only I wasn't
quite sure whether it was the memory itself or the mobo, causing
perfectly good memory to generate errors upon data delivery because it
couldn't reliably get the data to the CPU.

While I didn't have the necessary BIOS settings at the time, sometime
later a BIOS update gave me additional memory settings, and I found that
reducing the memory clock by a single notch, to 183 MHz (DDR doubled to
366), effectively PC3000 memory, did the trick. I was even able to tweak
some of the individual wait-state settings to get back a bit of the
performance I lost with the under-clocking. The memory and the entire
machine were rock-stable at the 183 MHz PC3000 setting.

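For reference, the PC ratings mentioned above follow from simple
arithmetic: DDR moves data twice per clock over a 64-bit (8-byte) bus. A
minimal sketch in Python, where the helper name is made up for
illustration; note that 183 MHz works out to a little under 3000 MB/s,
which is why it only counts as "effectively" PC3000:

```python
# Peak transfer rate of a DDR module: two transfers per clock on a
# 64-bit (8-byte) bus. The helper name is hypothetical, for illustration.
BUS_BYTES = 8
TRANSFERS_PER_CLOCK = 2


def ddr_peak_mb_per_s(clock_mhz):
    """Return the theoretical peak rate in MB/s for a given memory clock."""
    return clock_mhz * TRANSFERS_PER_CLOCK * BUS_BYTES


for clock_mhz in (200, 183):
    print(f"{clock_mhz} MHz -> DDR-{clock_mhz * TRANSFERS_PER_CLOCK}, "
          f"about {ddr_peak_mb_per_s(clock_mhz)} MB/s")
```

The 200 MHz case gives 3200 MB/s, matching the PC3200 rating; the
183 MHz case gives 2928 MB/s.
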
Later I upgraded from my then two 512 MB sticks to four 2 GB sticks, 8
gigs of memory total. It was indeed the memory, not the board, as the new
memory was just as stable at PC3200 as the old memory had been at the
under-clocked PC3000 speed.

Anyway, the way bzip2 works is apparently extremely stressful on memory,
as that would trigger the errors more than anything else. Compiles were
frustrating too, but sometimes I could compile for quite some time
without issues. That's why I didn't think it was the CPUs even before I
got the program to read the MCE numbers and tell me what they were. They
confirmed it was memory related: the errors were on data as the CPU got
it. I just didn't know until I actually changed memory whether it was
the mobo generating errors on the data in transit, or the memory itself.
It turned out to be the memory.

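The kind of load described above can be approximated with a small
round-trip check: compress and decompress the same buffer over and over,
and compare checksums each time. This is only a rough sketch, assuming
Python 3 is installed; the function name and the default sizes are made
up for illustration. A clean run proves nothing about the memory (it is
no substitute for memtest or the MCE logs), but a mismatch or a
decompression error on data that was never touched is a hint worth
following up.

```python
import bz2
import hashlib
import os


def bz2_round_trips(passes=20, size_mb=8):
    """Compress/decompress one buffer repeatedly; return failing pass numbers."""
    half = size_mb * 512 * 1024
    # Half random, half repetitive, so the compressor sees both kinds of input.
    pattern = (b"gentoo-amd64 " * (half // 13 + 1))[:half]
    data = os.urandom(half) + pattern
    expected = hashlib.sha256(data).hexdigest()

    failures = []
    for n in range(1, passes + 1):
        try:
            restored = bz2.decompress(bz2.compress(data, 9))
        except (OSError, ValueError, EOFError):
            # The compressed stream was reported as corrupt or truncated.
            failures.append(n)
            continue
        if hashlib.sha256(restored).hexdigest() != expected:
            # The round trip finished but the bytes changed along the way.
            failures.append(n)
    return failures


if __name__ == "__main__":
    bad = bz2_round_trips(passes=3, size_mb=1)
    print("failing passes:", bad if bad else "none")
```

Raising `passes` and `size_mb` makes the run longer and touches more
memory; an empty list only means every pass matched.
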
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

--
gentoo-amd64@g.o mailing list