Gentoo Archives: gentoo-dev

From: Daniel Drake <dsd@g.o>
To: Kevin <gentoo-dev@××××××.biz>
Cc: Gentoo Dev <gentoo-dev@l.g.o>
Subject: Re: [gentoo-dev] Major MCE problem with SMP on Gentoo kernels
Date: Thu, 13 May 2004 14:55:45
Message-Id: 40A38E86.1090609@gentoo.org
In Reply to: Re: [gentoo-dev] Major MCE problem with SMP on Gentoo kernels by Kevin
1 Hi Kevin,
2
3 Kevin wrote:
4 > Greg KH thinks it's bad memory, but I'm skeptical of that because the main
5 > address that fails (some 30 times in a row) is at 1023.8MB and the Dell
6 > Utilities only test up to 1022MB, and because I haven't seen the problem
7 > with the liveCD kernel.
8
9 Although I've very rarely dealt with SMP systems, I've seen many unstable
10 systems being diagnosed by various memory testing utilities as OK. As soon as
11 you run memtest, errors come up, and replacing the faulty memory amazingly
12 brings system stability again.
13
14 If your RAM is always producing errors in the same place (and only in 1 place)
15 then you might want to google for BadMem/BadRAM. These are two flavours of
16 kernel patches which allow you to ask the kernel to ignore specific blocks of
17 memory. You can even get memtest-x86 to output the exact parameters you need
18 based on memory faults it finds. This should allow you to ignore the faulty
19 part of the memory and continue on with the remaining ~1020 MB or so.
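
As a rough illustration of what such parameters look like, here is a minimal Python sketch. It assumes the address,mask pair form of the BadRAM boot parameter (a 1 bit in the mask means that address bit must match), 4 KiB pages, and a 32-bit physical address space. The fault address is hypothetical, picked to sit near the 1023.8MB mark, and the helper names are made up for this example. Check the documentation of whichever patch you apply before trusting the exact syntax.

```python
PAGE_SIZE = 4096          # assumption: 4 KiB pages, as on 32-bit x86
ADDR_BITS_MASK = 0xFFFFFFFF  # assumption: 32-bit physical addresses


def badram_pattern(fault_addr, page_size=PAGE_SIZE):
    """Return an (address, mask) pair covering the one page that holds fault_addr."""
    # Every bit above the page offset must match, so those mask bits are 1.
    mask = ADDR_BITS_MASK & ~(page_size - 1)
    base = fault_addr & mask
    return base, mask


def format_badram(pairs):
    """Format (address, mask) pairs as a single boot parameter string."""
    return "badram=" + ",".join(f"0x{a:08x},0x{m:08x}" for a, m in pairs)


# Hypothetical failing address just below the 1 GB boundary.
fault = 0x3FFCC123
print(format_badram([badram_pattern(fault)]))
```

The resulting string would go on the kernel command line in the bootloader configuration. In practice it is safer to copy the pattern the memory tester prints than to compute one by hand.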
20
21 Daniel
22
23 --
24 gentoo-dev@g.o mailing list