A friend of mine told me that AMD also had some trouble concerning the TLB
(translation lookaside buffer) on that architecture.
Unfortunately, I have no references for that issue.

I would keep an eye on that error, and if your system must be
highly available, I would even change the hardware.
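
One rough way to keep an eye on it is to count the machine check reports in the kernel log from time to time. Below is a minimal Python sketch, assuming the log lines keep the format shown in this thread. The function name `count_mce_events` and the regular expression are only an illustration, not part of any existing tool.

```python
import re
from collections import Counter

# Matches status lines of the form seen in this thread, e.g.
# [Hardware Error]: CPU:3 (10:2:3) MC1_STATUS[-|CE|-|-|-]: 0x9000000000000171
# (assumption: other kernel versions may format this line differently)
STATUS_LINE = re.compile(
    r"\[Hardware Error\]: CPU:(\d+) .*MC(\d+)_STATUS\[[^\]]*\]: (0x[0-9a-fA-F]+)"
)


def count_mce_events(log_text):
    """Count machine check reports per (cpu, bank) in kernel log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = STATUS_LINE.search(line)
        if match:
            cpu = int(match.group(1))
            bank = int(match.group(2))
            counts[(cpu, bank)] += 1
    return counts
```

Feeding it the text of `dmesg` (or of a saved kernel log) now and then should show whether the report stays a one-off or the count for a given CPU and bank keeps growing.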

Regards,

--
Ralf

On 09/24/13 10:01, Grant wrote:
>> I had a deeper look into the kernel sources:
>>
>> Your exact error message is emitted by
>> static bool k8_mc1_mce(u16 ec, u8 xec)
>>
>> So probably you have a K8 ;-)
>>
>> Have a look at:
>> http://www.redhat.com/archives/rhelv5-list/2007-October/msg00075.html
> I read it; that one sounds like a correctable ECC RAM error.
>
>> It *might* be an error concerning ECC error correction. Did you recently
>> change any hardware?
> No hardware changed in a very long time.
>
>> Could you attach your /proc/cpuinfo?
> Sure, I've attached it. I'm changing hosts and machines shortly and
> I've only seen this error once so I'm thinking I don't need to take
> action.
>
> - Grant
>
>
>>> I share this opinion.
>>> The message says - even if the error was corrected - that there's
>>> something dramatically wrong with your - I suppose - CPU.
>>> "Corrected error" might imply that some low-level feature got disabled
>>> in order to prevent further errors.
>>>
>>> Does this error appear only once at early boot or frequently?
>>>
>>> Regards,
>>> --
>>> Ralf
>>>
>>> On 09/23/13 22:07, Volker Armin Hemmann wrote:
>>>> On 23.09.2013 20:59, Paul Hartman wrote:
>>>>> On Mon, Sep 23, 2013 at 1:45 PM, Grant <emailgrant@×××××.com> wrote:
>>>>>> Can anyone tell me how to decipher this, which has appeared in dmesg?
>>>>>> Google wasn't very helpful.
>>>>>>
>>>>>> [Hardware Error]: MC1 Error: Copyback Parity/Victim error.
>>>>>> [Hardware Error]: Error Status: Corrected error, no action required.
>>>>>> [Hardware Error]: CPU:3 (10:2:3) MC1_STATUS[-|CE|-|-|-]: 0x9000000000000171
>>>>>> [Hardware Error]: cache level: L1, tx: INSN, mem-tx: EV
>>>>> Looks like a machine check error; it detected an error in the L1 cache
>>>>> on your CPU.
>>>>>
>>>>> Since it says "Corrected error, no action required" I would not worry
>>>>> about it. If that makes you feel any better. :)
>>>>>
>>>>>
>>>> Since those errors are rare, I would worry about it.
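
For reference, the decoded fields in the dmesg output above can be cross-checked by picking the raw status value apart by hand. This is a minimal Python sketch, not the kernel's code. It assumes the usual x86 machine check status layout (bit 63 valid, bit 62 overflow, bit 61 uncorrected, bit 60 enabled, bit 59 misc valid, bit 58 address valid, bit 57 processor context corrupt) and the memory-error code format `0000 0001 RRRR TTLL` in the low 16 bits. The name `decode_status` and the dictionary keys are made up for illustration, so please verify the bit positions against the AMD documentation for your CPU family.

```python
# Assumed bit positions in the MCi_STATUS register (x86 machine check
# architecture); verify against the vendor manual for your CPU family.
VAL, OVER, UC, EN, MISCV, ADDRV, PCC = 63, 62, 61, 60, 59, 58, 57

# Field names chosen to match the strings seen in the dmesg output.
LL_MSGS = ["RESV", "L1", "L2", "L3/GEN"]        # cache level
TT_MSGS = ["INSN", "DATA", "GEN", "RESV"]       # transaction type
RRRR_MSGS = ["GEN", "RD", "WR", "DRD", "DWR", "IRD", "PRF", "EV", "SNP"]


def bit(value, pos):
    """Return True if bit `pos` of `value` is set."""
    return bool((value >> pos) & 1)


def decode_status(status):
    """Decode the generic flags and, for memory errors, the error code."""
    ec = status & 0xFFFF  # low 16 bits: error code
    info = {
        "valid": bit(status, VAL),
        "overflow": bit(status, OVER),
        "uncorrected": bit(status, UC),
        "enabled": bit(status, EN),
        "misc_valid": bit(status, MISCV),
        "addr_valid": bit(status, ADDRV),
        "pcc": bit(status, PCC),
        "error_code": ec,
    }
    # Memory (cache) errors have the form 0000 0001 RRRR TTLL.
    if (ec & 0xFF00) == 0x0100:
        r4 = (ec >> 4) & 0xF
        info["cache_level"] = LL_MSGS[ec & 0x3]
        info["tx"] = TT_MSGS[(ec >> 2) & 0x3]
        info["mem_tx"] = RRRR_MSGS[r4] if r4 < len(RRRR_MSGS) else "unknown"
    return info
```

Calling `decode_status(0x9000000000000171)` on the value from the report should yield a valid, corrected (not uncorrected) error with cache level `L1`, transaction type `INSN` and memory transaction `EV`, which agrees with what the kernel printed.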