On Tuesday 11 May 2004 16:54, Chris Gianelloni wrote:
> On Tue, 2004-05-11 at 15:38, Kevin wrote:
> > Ok. Thanks for the suggestion. But what about this: Dell has a
> > utility partition and some programs for doing exhaustive testing of
> > all the hardware in the server. If I run the most thorough set of
> > tests available in this utility partition and I get a clean bill of
> > health, is that a reliable indication that there are no hardware
> > problems? Or does memtest86 do testing that's more exhaustive than
> > most such utility suites?
>
> I think the Dell suite would be more extensive.

Thanks for saying so, Chris.

>
> > If the utility partition testing says all is well (I've done it
> > several times in the last month or so, though maybe not the most
> > extensive tests), what's the next place to look for an explanation
> > of why this MCE is happening in Gentoo but not in SuSE?
>
> Are you sure that it isn't MCE *causing* these problems? Have you
> tried turning it off and seeing if you still have the same kinds of
> problems?

I'm not sure I understand what you mean by that. The first time I got a
kernel panic and MCE, I believe the kernel I was running had no MCE
handling configured (though I'm not sure of that). I had never seen an
MCE before, but after that first time, for every other kernel I built,
I searched through the .config options for MCE handlers and built them
into the kernel where they were available. IIRC, when I then got a
kernel panic with those kernels, I had some more information
(apparently generated by the kernel) on the console than I did with the
first MCE. I add this information in case it relates to your question
or point here, but I'm really not sure what you mean by "Have you tried
turning it off..." Where do I turn it off? Do you mean the .config
parameter in the kernel configuration process that builds (or not) a
handler for MCEs? Or do you mean something else?

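To make the ".config options" part concrete, here is a minimal sketch
(in Python) of how one might list the MCE-related options in a kernel
.config. The function name mce_options and the sample text are
illustrative assumptions, not my actual configuration, and the option
names in the sample are only examples of what a 2.6-era i386 config may
contain.

```python
import re

def mce_options(config_text):
    """Return {option: value} for each MCE-related line in a kernel .config.

    Options that are explicitly disabled ("# CONFIG_FOO is not set")
    are reported with the value "n".
    """
    found = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Enabled options look like CONFIG_X86_MCE=y
        m = re.match(r"^(CONFIG_\w*MCE\w*)=(.+)$", line)
        if m:
            found[m.group(1)] = m.group(2)
            continue
        # Disabled options are written as comments by the kernel config tools
        m = re.match(r"^# (CONFIG_\w*MCE\w*) is not set$", line)
        if m:
            found[m.group(1)] = "n"
    return found

# Hypothetical sample, not taken from the machine discussed here.
sample = """\
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
# CONFIG_X86_MCE_P4THERMAL is not set
CONFIG_SMP=y
"""

for name, value in mce_options(sample).items():
    print(name, value)
```

On a real system the text would come from the kernel source tree's
.config (commonly /usr/src/linux/.config on Gentoo, though that path is
an assumption here).
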
Honestly, I'm thinking that I may have somehow built some software
(during the stage 1 installation process) that is causing these
problems, but I followed the Gentoo Handbook for doing a stage 1
installation pretty rigidly, so I'm not sure what I might have done to
cause that. When I did the bootstrap.sh and emerge system, I was
running the kernel that I booted from the boot CD (2004.0 I think, and
probably even the smp kernel that was on that CD; IIRC, the 2004.1
boot CD has some problems that prevent the use of the smp kernel on
that CD).

In fact, now that I think of it, I'm pretty sure I didn't get any MCE
kernel panics until after I finished emerge system and other tasks and
then rebooted into my new Gentoo system. Perhaps this helps isolate the
cause of the problems. While I was doing the bootstrap.sh and emerge
system, I was definitely stressing the system with lots of compile
jobs (which is what has been triggering my MCEs), but I'm pretty sure I
did not get any MCE failures during those steps. Does this help someone
figure out what's going on in my case?

Are there some compiler flags or other configurable settings that, if
set to certain values during the bootstrap.sh or emerge system steps,
could end up generating software (perhaps when I built my own gcc?)
that would cause these MCEs to be thrown?

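In case it helps anyone answer that, here is a rough sketch (in Python)
of how the build settings in question could be pulled out of a make.conf
so they can be posted. The function name build_settings and the sample
values are hypothetical and are not my real settings; the parsing is
deliberately simple and assumes one VAR="value" assignment per line.

```python
import re

def build_settings(make_conf_text,
                   names=("CHOST", "CFLAGS", "CXXFLAGS", "MAKEOPTS")):
    """Return the selected VAR="value" assignments from make.conf text.

    Values are returned as written; references such as ${CFLAGS}
    are not expanded.
    """
    settings = {}
    for line in make_conf_text.splitlines():
        # Comment lines start with '#', so they never match (\w+) here.
        m = re.match(r'^\s*(\w+)\s*=\s*"?([^"#]*)"?', line)
        if m and m.group(1) in names:
            settings[m.group(1)] = m.group(2).strip()
    return settings

# Hypothetical sample values, for illustration only.
sample_conf = '''\
# These settings were set by the installer
CHOST="i686-pc-linux-gnu"
CFLAGS="-O2 -march=pentium4 -pipe"
CXXFLAGS="${CFLAGS}"
MAKEOPTS="-j3"
'''

for name, value in build_settings(sample_conf).items():
    print(name + "=" + value)
```
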
Like I said in the PS of my first post, I have this vague memory of
seeing something that said such-and-such is not smp safe. I have no
clue what that might have been now, though, or even if it's an accurate
memory. Some of this work was done in the wee hours...

Thanks for the replies and any other suggestions.

-Kevin


--
gentoo-dev@g.o mailing list