Hi, I've never run into this problem myself, nor do I have a dual Opteron
setup, but I'll try to help:

Nestor Camacho III wrote:
> It would just stop compiling. Would not freeze, I was able to do
> anything else, but it would just stop processing what it was currently
> compiling. When I alt f2 into another window and do a top, I would see
> that the sh process was at 99%, and had been running for a long time. If
> I do a kill -HUP, on the process things would continue... but I can't
> imgaine that his is a fix for the problem.

That looks like an infinite loop, but it could just as well be something
else. As always, it might be software or hardware :)

Do these problems show up reliably with a particular ebuild? You should
be able to tell which ebuild it was by looking at the end of
/var/log/emerge.log. If so, that ebuild may be at fault, and it would be
interesting to know whether you can reliably make it spin forever.
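
To check the log quickly, something like the following sketch might do. It
assumes the usual emerge.log line shape, where a build start is logged as
"<timestamp>:  >>> emerge (N of M) category/package-version to /". The
function name last_started_ebuild is made up for illustration.

```shell
# Print the last package emerge started building, according to the log.
# Assumption: build-start lines contain the marker ">>> emerge".
last_started_ebuild() {
    log="${1:-/var/log/emerge.log}"
    grep '>>> emerge' "$log" | tail -n 1
}

# Usage: last_started_ebuild                      (reads /var/log/emerge.log)
#        last_started_ebuild /path/to/a/copy/of/emerge.log
```

If the same package shows up every time the build hangs, that points at the
ebuild rather than at the machine.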

If not, it is probably the kernel and/or the hardware (unless the ebuild
is somehow behaving non-deterministically). Is there anything unusual in
the output of dmesg or in /var/log/messages that might indicate faulty
hardware (for example, memory)? Hardware failure can have many causes,
though: memory, PSU, bad or dying capacitors. You might want to run
memtest on it for a few nights. I also don't know what problems bad
sectors on a hard disk could cause for an ebuild (maybe something like
trying to create a file and always failing?).
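
A rough way to skim the kernel messages for trouble is sketched below. The
keyword list is only my guess at what faulty hardware tends to log, not a
complete catalogue, and scan_hw_errors is an illustrative name.

```shell
# Filter kernel log text for lines that often accompany hardware trouble.
# Assumption: the keywords below are a guess; extend them as needed.
# With no file argument, grep reads standard input.
scan_hw_errors() {
    grep -iE 'machine check|mce|oops|segfault|i/o error|bad page' "$@"
}

# Usage: dmesg | scan_hw_errors
#        scan_hw_errors /var/log/messages
```

An empty result does not prove the hardware is fine, so the memtest run is
still worth doing.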

> Is there anyone out there that has had this problem?

I also vaguely remember seeing some bugs in the kernel.org Bugzilla about
SMP kernels on Opterons showing strange random segfaults (apparently due
to some small error in their TLB). I don't know whether the fixes have
already been folded into the current Gentoo and vanilla kernels, or
whether your processor is actually affected (by the way, which kernel are
you running?). Maybe someone who is running SMP can comment?
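
For the kernel question, the details worth posting can be collected with
uname. A minimal sketch: is_smp_build is a hypothetical helper that only
looks for the "SMP" tag Linux puts into the build string printed by
`uname -v`.

```shell
# Succeeds if a kernel build string (as printed by `uname -v`) has the SMP tag.
is_smp_build() {
    case "$1" in
        *SMP*) return 0 ;;
        *)     return 1 ;;
    esac
}

uname -r    # kernel release of the running kernel
if is_smp_build "$(uname -v)"; then
    echo "SMP kernel"
else
    echo "non-SMP kernel"
fi
```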

>    PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
> *21166 root  25   0  4232  428  340 R 99.4  0.0 827:12.92 sh*
>  21278 jedi  16   0  188m  67m  12m S  2.0  3.3   0:11.35 python

I'm a bit puzzled by those two lines. Did you run emerge as a regular
user? It looks that way, or is that python process something else? I
don't really know how well emerge is supposed to work when it is not run
as root.
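
To see what those two processes actually are, ps can show the owner, the
parent and the full command line for a PID. A small sketch; describe_proc
is an illustrative name, and the PIDs in the usage lines are the ones from
the quoted top output.

```shell
# Show owner, parent PID, elapsed time and command line for one process.
describe_proc() {
    ps -o pid,ppid,user,etime,args -p "$1"
}

# Usage: describe_proc 21166    # the spinning sh
#        describe_proc 21278    # the python process owned by jedi
```

Following the PPID column upwards should show whether the sh was started
by emerge and which user launched it.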

Good luck,
Marco
--
gentoo-amd64@g.o mailing list