Casey,

We've been seeing issues like this for probably the last year. I was
never able to pin it on any particular action. We implemented remote
reboot hardware and called it a day.

Some of the machines showed strange activity, but over a larger group I
could never find a pattern to it. It almost seems as if a locked-up
machine cannot spawn any new processes.

I can't help except to say you're not alone.

Rob

Casey Allen Shobe - SeattleServer Mailing Lists wrote:
> Hey all,
>
> We're seeing occasional issues with a bunch of machines we have in a
> datacenter, most of which are currently running Gentoo. The machines will
> run fine for days, weeks, even months, and then just lock up solid: the
> box still pings and an nmap scan shows all the normal ports open, but
> nothing responds on any port, nothing shows up in system logs, and on the
> occasions we've had console access to a locked-up machine, a login prompt
> would show up, but it would just hang if you tried to log in.
>
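For what it's worth, the symptom above (pings, ports show open, nothing
answers) is what you'd expect if the kernel is still alive but userland is
stalled: ICMP replies and the TCP handshake for a listening socket are both
handled in the kernel, so a scan can see a port as open even when the daemon
never gets around to accept(). Below is a rough Python sketch of a probe that
makes that distinction. It assumes a service that sends a banner first (sshd
does, as do most SMTP servers); the function name `probe` and the three result
strings are made up for illustration, not taken from any existing tool.

```python
import socket


def probe(host, port, timeout=5.0):
    """Classify a TCP service that speaks first (e.g. SSH or SMTP).

    Returns one of (hypothetical labels):
      "unreachable"     - the TCP connect itself failed or timed out
      "accepted_silent" - handshake completed, but no banner arrived
      "responding"      - the service sent at least one byte
    """
    try:
        # The kernel completes the handshake for a listening socket,
        # so this can succeed even if the daemon itself is hung.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "unreachable"

    with sock:
        try:
            # Only a live userland process will actually send a banner.
            data = sock.recv(256)
        except OSError:
            # Covers socket.timeout as well as resets during the read.
            data = b""

    return "responding" if data else "accepted_silent"
```

Run from a monitoring host against port 22, a result of "accepted_silent"
while pings still work would suggest the box itself is wedged rather than
packets being dropped on the way, though it is only one data point.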
> This generally indicates hardware issues to me, but it has been happening
> across a wide array of both well-tested and new machines. In addition, it
> happens on machines that are running Red Hat 7.1 through 9.0 as well as
> Gentoo. The problem seems random, and there is almost always close to zero
> load on the machine when it locks up (only once were we actively using the
> machine, and it locked up while uncompressing a tar file).
>
> The Gentoo systems use the deadline I/O scheduler as it's deemed the most
> reliable, but this has shown up with the default anticipatory I/O scheduler
> as well.
>
> The only common factor seems to be that they are all plugged into a
> questionable HP Procurve switch that we've been contemplating replacing.
> Would that simply be wasting our time (I don't think a buggy switch should be
> able to lock up boxes...)? Any recommendations for what to investigate at
> this point?
>
> Cheers,

--
gentoo-server@g.o mailing list