List Archive: gentoo-server
Sébastien Arnaud wrote:
> I have started to "pour" more Gentoo Linux based server in a datacenter
> over the past year, I lost control 3 times of remote servers. One of
> them was after a hard reboot and filesystem check which required to
> press a key on the physical machine,
This is controlled by your fstab: change the last column (the fsck pass
number) to 0. That stops fsck from running on boot, but it can make
recovering a partition tricky, depending on how you have your disk sliced.
This is where the old-school argument for multiple partitions comes into
play. To each their own, though, so don't flame me for mentioning it.
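As a rough illustration of the fstab change described above, here is a minimal Python sketch that sets the sixth field (the fsck pass number) to 0 for one mount point. The function name and the device names in the usage note are hypothetical, and it works on fstab text passed in as a string rather than editing /etc/fstab in place, so you can review the result before writing anything back.

```python
def disable_boot_fsck(fstab_text, mount_point):
    """Return fstab_text with the fsck pass field set to 0 for mount_point."""
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        # Leave blank lines, comments, malformed entries, and other mount points alone.
        if (len(fields) < 4 or fields[0].startswith("#")
                or fields[1] != mount_point):
            out.append(line)
            continue
        # The dump and pass fields are optional and default to 0; pad them if absent.
        fields += ["0"] * (6 - len(fields))
        fields[5] = "0"  # sixth field: 0 means "never fsck this filesystem at boot"
        out.append("\t".join(fields))
    return "\n".join(out) + "\n"
```

For example, given a (made-up) entry such as `/dev/sda3 / ext3 noatime 0 1`, calling `disable_boot_fsck(text, "/")` rewrites only that line so its last column is 0 and leaves every other line untouched.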
> and the two remaining ones were
> linked to SSH terminating the connection after running some updates.
> So, I wanted to get some advice on how you all handle keeping control of
> your remote Gentoo servers, and for instance how to keep SSH running at
> all costs.
I would probably write a quick and dirty bash script, run it from cron, and
have it check that sshd is up. You could also get tricky with a Nagios-style
plugin that actually checks the connection and not just a running process.
There may be something out there that does this. SIM may have something
in it; I can't recall whether sshd was in the default checklist. I
usually just sit down and hack something out when needed.
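To make the "check the connection, not just the process" idea concrete, here is a minimal sketch, written in Python rather than bash. It connects to the SSH port and looks for the `SSH-` identification string that the server sends first (OpenSSH does this; other servers may send extra lines before it). The function names are hypothetical, and the restart command `/etc/init.d/sshd restart` is an assumption about your init setup, so adjust it to whatever your box actually uses.

```python
import socket
import subprocess


def sshd_answers(host="127.0.0.1", port=22, timeout=5.0):
    """Return True if host:port accepts a connection and sends an SSH banner."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            # Assumes the banner arrives in the first chunk the server sends.
            banner = conn.recv(255)
    except OSError:
        # Refused, unreachable, or timed out: treat all of these as "not answering".
        return False
    return banner.startswith(b"SSH-")


def check_and_restart(restart_cmd=("/etc/init.d/sshd", "restart")):
    """Restart sshd if it does not answer; restart_cmd is an assumed default."""
    if sshd_answers():
        return True
    subprocess.call(list(restart_cmd))
    return False
```

Saved as a script that calls `check_and_restart()` and run from root's crontab every few minutes, this would catch an sshd that is still in the process list but no longer accepting connections. Note that it checks from the machine itself, so it won't notice a firewall or routing problem between you and the server.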
Yet another possibility is running a back door for yourself: a separate
sshd on another port. But I probably wouldn't go this far; it is one more
thing to maintain and watch.
You could also very easily create a new xinetd service that restarts sshd.
Just make sure you lock it down to a trusted host ;)
Get creative. The more I think about it, the more ways come to mind.
> I have seen in different FAQs that running a serial cable to each server
> and using a SSH serial console switch is a good idea, but I am having
> trouble finding something cheap in this arena.
This is a PITA IMHO (having to manage hundreds of machines in a DC myself).
KVM over IP is another solution, but it is costly, and maintaining the
cabling over time is a PITA on larger networks.
> Also, how much better is
> it in terms of reliability in case something goes really wrong with the
> server? FYI, all the servers are plugged into a remote APC reboot switch
> but I almost never use this, as many times it ends up invalidating the
> filesystem and therefore requiring a physical intervention at the
> keyboard. Anyway around this problem as well?
Changing the fstab will help with this somewhat. fsck runs for a reason,
but sometimes getting the machine back up matters most.
Just my opinions,
Rob