On 11/30/06, Vladimir G. Ivanovic <vgivanovic@×××××××.net> wrote:
> I have done nothing to my hardware and I've seen this error, oh, a
> half a dozen times, the last time 3 months (?) ago. I ran memtest when
> I installed new memory, and it did not report problems even when run
> for hours.
|
memtest is basically useless these days. It can only tell you if you
have a bad memory cell, which almost never happens today. Most memory
problems are the result of timing issues between the processor(s) and
DMA controllers.
|
This script [1] seems to be a much better memory test for modern
systems, although you may have to make some tweaks to run it on
Gentoo.
|
> And I do not get random segfaults with other programs.
|
Yes, compiling is unusual in this regard. The memory access
pattern of a compiler, reading and writing to locations on different
rows, or even different modules, under high CPU load and using lots of
memory, with some IO thrown in for good measure, tends to reveal
hardware problems quite nicely.
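To make that concrete, here is a rough Python sketch of the same idea: push a lot of data through memory in a scattered pattern, then check that repeated passes agree. It is not the script in [1], and userspace code cannot choose DRAM rows or modules, so treat it only as an illustration. The names `stress_pass` and `digests_agree` are made up for this example.

```python
import hashlib
import random


def stress_pass(size_bytes: int, seed: int, block: int = 4096) -> str:
    """Fill a buffer with seeded data, swap far-apart blocks, return a digest.

    size_bytes must be greater than zero. The same seed always produces
    the same data and the same swaps, so the digest should never change.
    """
    rng = random.Random(seed)
    data = rng.getrandbits(8 * size_bytes).to_bytes(size_bytes, "little")
    buf = bytearray(data)
    blocks = size_bytes // block
    for _ in range(blocks):
        # Pick two block offsets at random so accesses jump around the buffer.
        a = rng.randrange(blocks) * block
        b = rng.randrange(blocks) * block
        buf[a:a + block], buf[b:b + block] = buf[b:b + block], buf[a:a + block]
    return hashlib.sha256(buf).hexdigest()


def digests_agree(passes: int = 3, size_bytes: int = 1 << 20, seed: int = 1234) -> bool:
    """Return True if every pass over identical input gives the same digest."""
    digests = {stress_pass(size_bytes, seed) for _ in range(passes)}
    return len(digests) == 1
```

On sound hardware `digests_agree()` should always return True. To get anywhere near the load of a real build you would have to run several copies at once with buffers large enough to fill most of your RAM.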
|
> Finally, I don't think my hardware fixed itself.
>
> Given all of this, my suspicion is that these errors are software
> bugs, not hardware problems.
|
If we were talking about a driver, or an event-based GUI program, I
might agree. But a compiler is going to take the exact same actions
given the same input and options. The compiler isn't going to do
something different between two executions over the _exact_
same sources because it feels like it.
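You can check this yourself by running the identical command several times and comparing the outcomes. A minimal Python sketch follows; the helper names are invented for this example, and it only looks at the exit status and stdout.

```python
import hashlib
import subprocess
import sys


def run_and_fingerprint(cmd):
    """Run cmd once and return (exit status, SHA-256 of its stdout)."""
    proc = subprocess.run(cmd, capture_output=True)
    return proc.returncode, hashlib.sha256(proc.stdout).hexdigest()


def is_deterministic(cmd, runs=3):
    """Return True if every run of cmd gives the same status and output."""
    results = {run_and_fingerprint(cmd) for _ in range(runs)}
    return len(results) == 1


def report(cmd, runs=3):
    """Print a warning on stderr if identical runs disagree."""
    if not is_deterministic(cmd, runs):
        print("identical runs disagree; suspect the hardware", file=sys.stderr)
```

For example, `report(["gcc", "-O2", "-S", "-o", "-", "foo.c"])` compares the assembly gcc writes to stdout, where `foo.c` stands in for whichever file triggered the error. A mismatch, or a crash on only some of the runs, points away from the compiler itself.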
|
>
> The other thing that I don't really believe is the part about "this
> bug not being reproducible" as reported by portage/emerge/make/gcc.
|
Then you should read the gcc sources. One of the patches applied by
Gentoo adds a retry loop when the compiler is about to exit with an
internal compiler error (ICE). It retries the compile twice, and if
either of those succeeds, you get the "The bug is not reproducible"
message. The retries themselves print nothing, because extra output
could obscure the original error.
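The logic is roughly the following, sketched here in Python. This is not the patch's code, which lives inside gcc and is written in C. `compile_once` is a hypothetical callable that returns True on success and False on an ICE, and the return strings are my own labels.

```python
from typing import Callable

OK = "ok"
NOT_REPRODUCIBLE = "ICE, but the bug is not reproducible"
REPRODUCIBLE = "ICE, reproducible"


def compile_with_retries(compile_once: Callable[[], bool], retries: int = 2) -> str:
    """Run one compile; on an ICE, silently retry up to `retries` times."""
    if compile_once():
        return OK
    for _ in range(retries):
        # The retries stay quiet so the original error is not obscured.
        if compile_once():
            return NOT_REPRODUCIBLE
    return REPRODUCIBLE
```

If the same input fails once and then succeeds, the compiler's logic is unlikely to be at fault, which is why that outcome points to hardware. Only a failure that survives every retry looks like a genuine compiler bug.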
|
The Gentoo devs probably added this loop to avoid more duplicates of [2].
|
-Richard
|
[1] http://people.redhat.com/dledford/memtest.html
[2] http://bugs.gentoo.org/show_bug.cgi?id=20600
--
gentoo-user@g.o mailing list