Gentoo Archives: gentoo-user

From: "Vladimir G. Ivanovic" <vgivanovic@×××××××.net>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] SegFault while compiling gcc 4.1.1
Date: Fri, 01 Dec 2006 05:39:40
Message-Id: 456FBF28.5010109@comcast.net
In Reply to: Re: [gentoo-user] SegFault while compiling gcc 4.1.1 by Richard Fish
Richard Fish wrote:
> On 11/30/06, Vladimir G. Ivanovic <vgivanovic@×××××××.net> wrote:
>> I have done nothing to my hardware and I've seen this error, oh, a
>> half a dozen times, the last time 3 months (?) ago. I ran memtest when
>> I installed new memory, and it did not report problems even when run
>> for hours.
>
> memtest is basically useless these days. It can only tell you if you
> have a bad memory cell, which almost never happens today. Most memory
> problems are the result of timing issues between the processor(s) and
> DMA controllers.
>
> This script [1] seems to be a much better memory test for modern
> systems, although you may have to make some tweaks to run it on
> Gentoo.

Just for kicks I'll run the script and see what happens.
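
To make the contrast with a cell-by-cell pattern test concrete, here is a rough sketch of a load-based consistency check. It is not the script at [1]; it is a hypothetical illustration that assumes the useful signal is data corruption under concurrent memory and I/O activity. The function names, block sizes and worker count are all made up, and a clean run proves very little about the hardware.

```python
import hashlib
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_and_verify(args):
    """Write a pseudo-random block to disk, read it back, compare hashes."""
    directory, index, size = args
    data = os.urandom(size)
    expected = hashlib.sha256(data).hexdigest()
    path = os.path.join(directory, f"block-{index}.bin")
    with open(path, "wb") as handle:
        handle.write(data)
    # The read-back will usually be served from the page cache, so this
    # mostly exercises memory and the copy paths rather than the disk.
    with open(path, "rb") as handle:
        actual = hashlib.sha256(handle.read()).hexdigest()
    os.remove(path)
    return expected == actual


def stress(workers=2, blocks=8, size=1 << 20):
    """Return the number of blocks whose read-back hash did not match."""
    with tempfile.TemporaryDirectory() as directory:
        jobs = [(directory, i, size) for i in range(blocks)]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(write_and_verify, jobs))
    return results.count(False)


if __name__ == "__main__":
    print("mismatched blocks:", stress())
```

Any non-zero count would be worth investigating, but the sizes here are far too small to load a real machine; they are kept small only so the sketch runs quickly.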

>
>> And I do not get random segfaults with other programs.
>
> Yes, compiling is very unique in this regard. The memory access
> pattern of a compiler, reading and writing to locations on different
> rows, or even different modules, under high CPU load and using lots of
> memory, with some IO thrown in for good measure, tends to reveal
> hardware problems quite nicely.
>
>> Finally, I don't think my hardware fixed itself.
>>
>> Given all of this, my suspicion is that these errors are software
>> bugs, not hardware problems.

For grins, here is part of comment #174:

Random segfaults during compilation. ... in general a sign of
hardware problems.

// No, this is in general a sign of GCC 4.1 - problem ;-)
>
> If we were talking about a driver, or an event-based GUI program, I
> might agree. But a compiler is going to take the exact same actions
> given the same input and options. The compiler isn't going to do
> something different between 2 different executions over the _exact_
> same sources because it feels like it.

You're right at the logical level, but not at the physical level.
Cache effects and different disk accesses are two physical differences
that spring to mind. Temporary files will be in different physical
sectors, or in the buffer cache or not; directories may or may not be
in the directory cache. Depending on what else is running, the pattern
of cache misses will be different.

I emerge with -j2. Plus I'm doing work while the emerges happen. The
likelihood of the memory access pattern of two compiles being the same
is precisely zero.

>
>>
>> The other thing that I don't really believe is the part about "this
>> bug not being reproducible" as reported by portage/emerge/make/gcc.
>
> Then you should read the gcc sources. One of the patches applied by
> Gentoo adds a retry loop when the compiler is about to exit with an
> internal compiler error (ICE). It retries the compile twice, and if
> either of those succeeds, you get the "The bug is not reproducible"
> message.

Interesting. I did not know that. But I don't get why gcc exits with
an error when the second (or third) try succeeds? Why not just print a
warning, perhaps at the end so it is noticeable? Most people will
restart the entire emerge, which seems like a gargantuan amount of
wasted effort since the re-compilation has succeeded.
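
The retry behaviour described above can be sketched roughly as follows. This is not the Gentoo patch; it is a hypothetical wrapper (the name run_with_retries and its return convention are invented for illustration) that only mirrors the description in this thread: rerun the failing command twice, keep the retries quiet, and still report the original failure even if a retry succeeds.

```python
import subprocess
import sys


def run_with_retries(command, retries=2):
    """Run command; if it fails, rerun it quietly up to `retries` times.

    Returns (exit_code, reproducible). The exit code of the first run is
    always kept, so the caller still sees the error when a retry passes.
    `reproducible` is None when the first run succeeded.
    """
    first = subprocess.run(command)
    if first.returncode == 0:
        return 0, None
    for _ in range(retries):
        # Retries are silenced so they cannot obscure the original error.
        retry = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if retry.returncode == 0:
            return first.returncode, False
    return first.returncode, True


if __name__ == "__main__":
    if len(sys.argv) < 2:
        sys.exit("usage: retry.py COMMAND [ARGS...]")
    code, reproducible = run_with_retries(sys.argv[1:])
    if reproducible is False:
        print("The bug is not reproducible", file=sys.stderr)
    sys.exit(code)
```

Note that the wrapper exits non-zero even when a retry succeeds, which is exactly the behaviour being questioned here: the successful retry is used only to classify the failure, not to recover from it.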

> It doesn't output anything because that would possibly
> obscure the original error.
>
> The gentoo devs probably added this loop to avoid more duplicates of [2].
>
> -Richard
>
> [1] http://people.redhat.com/dledford/memtest.html
> [2] http://bugs.gentoo.org/show_bug.cgi?id=20600

--
gentoo-user@g.o mailing list