Helmut Jarausch wrote:
> On 24 Nov, Stefan G. Weichinger wrote:
>> Stefan G. Weichinger wrote:
>>> Stefan G. Weichinger wrote:
>>>
>>>> Since then no crashes, but I would have to test clicking some more stuff
>>>> to really believe ...
>>> As always, after hitting SEND ... one more crash ...
>> Sometimes it crashes after clicking opera, sometimes after clicking
>> thunderbird, so far never when clicking/starting a gnome-terminal.
>>
>> I am still looking for a pattern or an error message somewhere ...
>>
>
> This reminds me of a problem we had just recently.
> Have you got a multi-core CPU?
> If yes, read on.
>
> We have 6 machines here running an identical Gentoo system
> (just different hostname and IP number)
> with an AMD Phenom II quad-core CPU and identical motherboards.
> One of them had the random crashes you reported.
> I've tortured the memory by running up to 3 memtester processes
> overnight, without a single fault. Our dealer replaced the motherboard,
> again no change. Then I suspected the CPU itself, although it had survived
> a burnK7 run lasting several hours.
>
> After the CPU was replaced, the spook was gone.
> I suspect a cache coherence problem. The usual memory tests
> assign a given window of physical memory to a given core,
> even if run in parallel. But in typical use Linux quite frequently
> switches the core that executes a given thread.
> Now the Phenom II has 4 cores, each with a private 0.5 MB primary cache,
> plus a 6 MB second-level cache shared by all 4 cores.
> In the BIOS one can opt for all 4 cores using this secondary cache
> or for only a single core using it.
> When a core writes to this cache or to memory, all other cores must be
> informed that their private cache is invalid. If this doesn't happen, or
> happens a bit too late, a core will fetch invalid (old) memory contents,
> which may result in a crash.
> So, if you can, set the BIOS switch so that only a single core
> can use the secondary cache. If the problem disappears,
> the CPU is broken.
|
Phew, quite some theory ... do you positively know that this was the cause?
|
I don't think I have seen such a setting in my BIOS.
|
I use an Intel Core2Duo E6600 on an Intel DP965LT board here, with 8 gigs of
RAM lately ...
|
BUT my issues really only started after going completely to ~amd64; I
never saw such a crash before, when I used a mixed setup (most pkgs
stable, some unstable ...)
|
I will have a look at my BIOS now.
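
If I understand your point about the memory tests correctly, a check would
have to write on one core and read the data back on another. A rough sketch
of that idea, assuming Linux and Python 3. The function name
cross_core_check and its parameters are made up for illustration, and a
script like this is far gentler than a real stress tool, so a clean run
proves very little:

```python
import os


def cross_core_check(size=1 << 20, rounds=4):
    """Write a byte pattern on one core, verify it on another.

    Returns the number of bytes that read back differently.
    Hypothetical helper for illustration only, not an existing tool.
    """
    # os.sched_setaffinity is only available on Linux; without it we
    # cannot pin the thread and the check degrades to a plain readback.
    can_pin = hasattr(os, "sched_setaffinity")
    cpus = sorted(os.sched_getaffinity(0)) if can_pin else [0]
    original = set(cpus)
    buf = bytearray(size)
    mismatches = 0
    try:
        for r in range(rounds):
            # With only one usable CPU, writer and reader are the same core.
            writer = cpus[r % len(cpus)]
            reader = cpus[(r + 1) % len(cpus)]
            pattern = (0x55 + r) & 0xFF
            if can_pin:
                os.sched_setaffinity(0, {writer})  # pin to the writing core
            buf[:] = bytes([pattern]) * size       # fill the buffer
            if can_pin:
                os.sched_setaffinity(0, {reader})  # migrate to the reading core
            mismatches += size - buf.count(pattern)  # count wrong bytes
    finally:
        if can_pin:
            os.sched_setaffinity(0, original)      # restore the old affinity
    return mismatches


if __name__ == "__main__":
    print(cross_core_check())
```

On a healthy machine this should print 0; any other value would mean that
some byte read back differently after the thread moved to another core.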
|
Thanks anyway for that information, greetings to Aachen (from Austria) ...
|
Stefan |