On 29-03-2015 12:45 PM, Mick wrote:
> On Sunday 29 Mar 2015 16:42:10 Sebas Pedersen wrote:
>> On 28-03-2015 08:50 PM, Mick wrote:
>> > On Saturday 28 Mar 2015 22:48:48 Sebas Pedersen wrote:
>> >> On 28-03-2015 07:37 PM, Volker Armin Hemmann wrote:
>> >> > On 28.03.2015 at 23:00, Sebas Pedersen wrote:
>> >> >> On 28-03-2015 06:45 PM, Volker Armin Hemmann wrote:
>> >> >>> On 28.03.2015 at 14:58, Sebas Pedersen wrote:
>> >> >>>> Hi guys,
>> >> >>>>
>> >> >>>> For the past few days I have been experiencing an MCE error.
>> >> >>>> Sometimes I turn on the computer and at some point while booting
>> >> >>>> (after the grub menu) the kernel just freezes and prints this:
>> >> >>>>
>> >> >>>> CPU 0: Machine Check Exception: 4 Bank 4: b200000000070f0f
>> >> >>>> TSC f5acc9180
>> >> >>>> PROCESSOR 2:20fc2 TIME 1427486735 SOCKET 0 APIC 0 microcode 0
>> >> >>>>
>> >> >>>> The number for TSC may vary, but the b200000000070f0f is always
>> >> >>>> the same (at least for now). The error message suggests parsing the
>> >> >>>> above error with mcelog. I did that and the result was:
>> >> >>>>
>> >> >>>> Hardware event. This is not a software error.
>> >> >>>> CPU 0 4 northbridge TSC f5acc9180
>> >> >>>> TIME 1427486735 Fri Mar 27 17:05:35 2015
>> >> >>>>
>> >> >>>> Northbridge Watchdog error
>> >> >>>>
>> >> >>>> bit57 = processor context corrupt
>> >> >>>> bit61 = error uncorrected
>> >> >>>>
>> >> >>>> bus error 'generic participation, request timed out
>> >> >>>>
>> >> >>>> generic error mem transaction
>> >> >>>> generic access, level generic'
>> >> >>>>
>> >> >>>> STATUS b200000000070f0f MCGSTATUS 4
>> >> >>>> CPUID Vendor AMD Family 15 Model 44
>> >> >>>> SOCKET 0 APIC 0 microcode 0
>> >> >>>>
>> >> >>>> The error suggests it's a hardware problem. I replaced the RAM with
>> >> >>>> no luck. The same error keeps happening.
>> >> >>>>
>> >> >>>> Any suggestions for identifying the problem or how to proceed?
>> >> >>>>
>> >> >>>> Many thanks in advance!
>> >> >>>>
>> >> >>>> Sebas
>> >> >>>
>> >> >>> BIOS update/microcode update. A Google search suggests that you have
>> >> >>> run into an erratum.
>> >> >>
>> >> >> Oh OK, thank you. I must have missed that in the search. So you are
>> >> >> saying that the error comes from a BIOS erratum (I don't know what
>> >> >> microcode is), and the fix is to update the BIOS?
>> >> >
>> >> > No, possibly from a CPU erratum, and a BIOS update might bring in the
>> >> > microcode update that works around it.
>> >>
>> >> I see, thanks for clarifying that. So it looks like there are not too
>> >> many options: try to update the BIOS and/or replace the CPU.
>> >>
>> >> I really appreciate your replies and time.
>> >>
>> >> Thanks!
>> >> Sebas
>> >
>> > There's 'CONFIG_MICROCODE=y' and friends in the kernel which, along with
>> > sys-apps/microcode-ctl, will load whatever is the latest Intel/AMD CPU
>> > code (firmware) to patch any bugs with instructions that the CPU
>> > manufacturers have discovered.
>>
>> That's nice. I'm going to compile the kernel and see what happens.
>>
>> Many thanks!
>
> Don't forget to enable the relevant module for your type of CPU.

You're right. Thanks for the reminder!
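
For anyone following along, here is a rough sketch of how the microcode options could be checked in a kernel config before rebuilding. The option names CONFIG_MICROCODE_AMD and CONFIG_MICROCODE_INTEL are my assumption for the "friends" mentioned above (they are not spelled out in this thread), and the helper name and sample text are made up for illustration.

```python
import re

# Assumed option names; only CONFIG_MICROCODE is named in the thread itself.
MICROCODE_OPTIONS = (
    "CONFIG_MICROCODE",
    "CONFIG_MICROCODE_AMD",
    "CONFIG_MICROCODE_INTEL",
)


def microcode_options(config_text):
    """Return {option: value}; options that are absent or 'not set' map to 'n'."""
    found = dict.fromkeys(MICROCODE_OPTIONS, "n")
    for line in config_text.splitlines():
        match = re.match(r"(CONFIG_\w+)=(\S+)", line.strip())
        if match and match.group(1) in found:
            found[match.group(1)] = match.group(2)
    return found


# Hypothetical config excerpt for an AMD machine.
sample = """\
CONFIG_MICROCODE=y
CONFIG_MICROCODE_AMD=y
# CONFIG_MICROCODE_INTEL is not set
"""

for option, value in microcode_options(sample).items():
    print(option, value)
```

On a real system the text would come from the kernel source tree's .config file rather than from a sample string.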

Best Regards,
Sebas
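
P.S. As a cross-check of the mcelog output quoted above, the status word b200000000070f0f can be picked apart by hand. This is only a sketch: the flag positions in bits 57 to 63 are the architectural x86 machine-check bits, but reading bits 16 to 19 as the northbridge extended error code, and bit 8 as the bus-error timeout flag, is my assumption for this CPU family, and the function name is made up.

```python
# Architectural flag bits in the upper part of an x86 MCi_STATUS register.
STATUS_FLAGS = {
    63: "VAL (status valid)",
    62: "OVER (earlier error overwritten)",
    61: "UC (error uncorrected)",
    60: "EN (error reporting enabled)",
    59: "MISCV (MISC register valid)",
    58: "ADDRV (ADDR register valid)",
    57: "PCC (processor context corrupt)",
}


def decode_status(status):
    """Split a 64-bit machine-check status word into flags and error codes."""
    flags = [
        name
        for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    error_code = status & 0xFFFF
    return {
        "flags": flags,
        "mca_error_code": error_code,
        # Assumption: bits 16-19 hold the extended error code on this family.
        "extended_error_code": (status >> 16) & 0xF,
        # Assumption: bus-error code format, where bit 8 means "timed out".
        "timed_out": bool((error_code >> 8) & 1),
    }


result = decode_status(0xB200000000070F0F)
for flag in result["flags"]:
    print(flag)
print(hex(result["mca_error_code"]), result["extended_error_code"])
```

For this status word the set flags are VAL, UC, EN and PCC, which agrees with the bit57 and bit61 lines that mcelog printed.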