Gentoo Archives: gentoo-user

From: "Stefan G. Weichinger" <lists@×××××.at>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] ~amd64 : X11 (?) crashing
Date: Tue, 24 Nov 2009 22:07:00
Message-Id: 4B0C480A.7030007@xunil.at
In Reply to: Re: [gentoo-user] ~amd64 : X11 (?) crashing by Helmut Jarausch
Helmut Jarausch wrote:
> On 24 Nov, Stefan G. Weichinger wrote:
>> Stefan G. Weichinger wrote:
>>> Stefan G. Weichinger wrote:
>>>
>>>> Since then no crashes, but I would have to test clicking some more stuff
>>>> to really believe ...
>>> As always, after hitting SEND ... one more crash ...
>> Sometimes it crashes after clicking opera, sometimes after clicking
>> thunderbird, so far never when clicking/starting a gnome-terminal.
>>
>> I am still looking for a pattern or an error-message somewhere ...
>>
>
> This reminds me of a problem we had just recently.
> Have you got a multi-core CPU?
> If yes, read on.
>
> We have 6 machines here running an identical Gentoo system
> (just different hostname and IP number)
> with an AMD Phenom II quad-core CPU and identical motherboards.
> One of them had these random crashes you reported.
> I tortured the memory by running up to 3 memtester processes
> overnight - not a single fault. Our dealer replaced the motherboard -
> again no change. Then I suspected the CPU itself, although it had
> survived a burnK7 run for several hours.
>
> After the CPU was replaced, the spook was gone.
> I suspect a cache coherence problem. The normal memory tests
> assign a given window of the physical storage to a given core -
> even if run in parallel. But typical usage under Linux switches
> the core which executes a given thread quite frequently.
> Now the Phenom II has 4 cores, each with a private 0.5 MB primary cache,
> but a 6 MB second-level cache common to all 4 cores.
> In the BIOS one can opt for all 4 cores using this secondary cache
> or for only a single core using it.
> When a core writes to this cache or to memory, all other cores must be
> informed that their private cache is invalid. If this doesn't happen or
> happens a bit too late, a core will fetch invalid (old) memory contents,
> which may result in a crash.
> So, if you can, set the BIOS switch so that only a single core
> can use the secondary cache. If the problem disappears,
> the CPU is broken.

Phew, quite some theory ... do you positively know that this was the reason?

I think I haven't seen such a setting in my BIOS.

I use an Intel Core2Duo E6600 on an Intel DP965LT board here, 8 gigs of
RAM lately ...

BUT my issues really only started after completely going to ~amd64, I
never saw such a crash before when I used a mixed setup (most pkgs
stable, some unstable ...)

I will have a look at my BIOS now.

Thanks anyway for that information, greetings to Aachen (from Austria) ...

Stefan
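
P.S. The core-switching idea quoted above (write on one core, read the same memory back from another) could be poked at with something like the following. This is only a rough sketch under stated assumptions: the function name cross_core_check and its parameters are made up for illustration, it relies on Linux-only os.sched_setaffinity, and a clean run proves nothing about the CPU. It is no substitute for memtester or swapping hardware.

```python
import os

def cross_core_check(size=1 << 20, rounds=4):
    """Write a byte pattern while pinned to one CPU, then count it back
    while pinned to the next CPU. Returns the number of mismatched bytes."""
    original = os.sched_getaffinity(0)      # CPUs this process may use (Linux only)
    cpus = sorted(original)
    buf = bytearray(size)
    mismatches = 0
    try:
        for r in range(rounds):
            writer = cpus[r % len(cpus)]
            reader = cpus[(r + 1) % len(cpus)]
            pattern = bytes([(r * 37 + 1) % 256])   # different fill byte each round
            os.sched_setaffinity(0, {writer})       # pin to the writing core
            buf[:] = pattern * size
            os.sched_setaffinity(0, {reader})       # migrate to the reading core
            mismatches += size - buf.count(pattern)
    finally:
        os.sched_setaffinity(0, original)           # restore the original CPU mask
    return mismatches

if __name__ == "__main__":
    print("mismatched bytes:", cross_core_check())
```

On a machine with a single usable CPU the writer and reader are the same core, so the check degenerates into a plain write/read test.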