Gentoo Archives: gentoo-server

From: Robert Sanders <rob-lists@××××××××.com>
To: gentoo-server@l.g.o
Subject: Re: [gentoo-server] Server lockups (still ping) (OT because not Gentoo-specific?)
Date: Sun, 24 Apr 2005 14:43:50
Message-Id: 426BB0AB.6000706@route256.com
In Reply to: [gentoo-server] Server lockups (still ping) (OT because not Gentoo-specific?) by Casey Allen Shobe - SeattleServer Mailing Lists

Casey,

We've been seeing issues like this for probably the last year. I was
never able to pinpoint it to any action. We implemented remote reboot
hardware and called it a day.

Some of them had strange activity, but over a larger group of machines I
could never find a pattern to it. It almost seems as if it cannot spawn
any new processes.
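
To make the symptom concrete (ports still accept connections, but nothing
answers), here is a rough sketch of a probe that could run from a separate
monitoring host. It relies on the fact that the kernel completes the TCP
handshake by itself, so a connect can succeed even when the daemon behind the
port is wedged. The function name `probe`, the three result strings, and the
timeout are made up for illustration, and it only makes sense for services that
send a banner on connect (SSH, SMTP, FTP).

```python
import socket


def probe(host, port, timeout=5.0):
    """Classify a banner-sending TCP service (hypothetical helper).

    Returns "unreachable" if the connection cannot be opened,
    "silent" if it opens but no data arrives within the timeout,
    and "responding" if the service sends at least one byte.
    """
    try:
        # The kernel answers the handshake, so this can succeed even
        # when the userland daemon is hung.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "unreachable"

    with sock:
        try:
            # create_connection() already applied the timeout to the socket.
            data = sock.recv(64)
        except OSError:  # includes socket.timeout
            data = b""

    return "responding" if data else "silent"
```

A result of "silent" on port 22 while the box still answers ping would match
the lockups described in this thread, and could be logged with a timestamp or
used to trigger the remote reboot hardware.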

I can't help except to say you're not alone.

Rob

Casey Allen Shobe - SeattleServer Mailing Lists wrote:
> Hey all,
>
> We're seeing occasional issues with a bunch of machines we have in a
> datacenter, most of which are currently running Gentoo. The machines will
> run solid and fine for days, weeks, even months, and then just lock up solid
> - the box still pings and an nmap scan shows all the normal ports open, but
> nothing responds on any port, nothing shows up in system logs, and the times
> we've had console access to a machine at the time, a login prompt would show
> up, but it would just hang if you tried to log in.
>
> This generally indicates hardware issues to me, but it has been happening
> across a wide array of both well-tested and new machines. In addition, it
> happens on machines that are running Red Hat 7.1 through 9.0 as well as
> Gentoo. The problem seems random, and there is almost always close to zero
> load on the machine when it locks up (only once were we presently using the
> machine, and it locked up while uncompressing a tar file).
>
> The Gentoo systems use the deadline I/O scheduler as it's deemed the most
> reliable, but this has shown up with the default anticipatory I/O scheduler
> as well.
>
> The only common factor seems to be that they are all plugged into a
> questionable HP Procurve switch that we've been contemplating replacing.
> Would that simply be wasting our time (I don't think a buggy switch should be
> able to lock up boxes...)? Any recommendations for what to investigate at
> this point?
>
> Cheers,

--
gentoo-server@g.o mailing list