Gentoo Archives: gentoo-user

From: Alan McKinnon <alan.mckinnon@×××××.com>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Re: monit and friends.
Date: Mon, 16 Oct 2017 15:55:14
Message-Id: e57aa55a-c651-2c39-34a8-ab13a4a981f1@gmail.com
In Reply to: Re: [gentoo-user] Re: monit and friends. by Mick
1 On 16/10/2017 17:41, Mick wrote:
2 > On Monday, 16 October 2017 16:12:53 BST Alan McKinnon wrote:
3 >> On 16/10/2017 17:08, Ian Zimmerman wrote:
4 >>> On 2017-10-16 14:11, Alan McKinnon wrote:
5 >>>> My needs here are pretty simple:
6 >>>> local watchdog that checks if a program is running and restart it if
7 >>>> not. If that fails 3 times or so, alert me.
8 >>>> Maybe a few file/dir/fifo monitors as well. Not much else.
9 >>>>
10 >>>> I don't need any of monit's graphing features or M/monit, I have other
11 >>>> tools for that. And mostly don't even need its HTTP API either.
12 >>>
13 >>> supervisor (aka supervisord)
14 >>>
15 >>> http://supervisord.org/
16 >>>
17 >>> python based, not sure if that's okay with you
18 >>
19 >> I forgot about supervisord. Like monit, it runs everywhere and might be
20 >> easier for the team-mates to understand and work with.
21 >>
22 >> Python is not a problem, all these hosts are ansible-managed anyway, so
23 >> they all have to run python-2.7
24 >>
25 >> Good find, thanks!
26 >
27 > I've used Nagios in the past, but have not kept up with its development and
28 > the many plugins it provides. It could do any of the above tasks and much
29 > more. It can run scripts (perl, or bash) via daemons (nrpe) on the remote
30 > systems to restart applications, et al. The Nagios server possessed the
31 > ability to set up quite intelligent monitoring and alert hierarchies with
32 > multilayered comms structures to make sure you are not woken up at 2 a.m. by
33 > your boss, just because a ping failed to his home NAS. I also found the logs
34 > which can also be stored in SQL quite useful both in troubleshooting problems
35 > and in producing reports. It can monitor network connectivity, remote OS
36 > parameters and applications. Writing your own plugin/module to monitor quite
37 > specialised use cases is not particularly difficult either.
38 >
39 > I expect you may find Nagios more complicated to set up than monit, at least
40 > initially, but if you don't have the luxury of time to invest in setting up
41 > Nagios, monit may be a better fit. I don't have in-depth experience with other
42 > monitoring software to comment on, so something else may better suit your
43 > specific needs.
44 >
45
46
47 Nagios and I go way back, way way waaaaaay back. I now recommend it
48 never be used unless there really is no other option. There are just so
49 many problems with actually using the bloody thing, but let's not get
50 into that :-)
51
52 I have a full monitoring system that tracks and reports on the state of
53 most things, but as it's a monitoring system it is forbidden to make
54 changes of any kind at all, and that includes restarting failed daemons.
55 Turns out that daemons that fail for no good reason are becoming more
56 and more common in this day and age, mostly because we treat them like
57 cattle not pets and use virtualization and containers so much. And
58 there's our old friend the Linux oom-killer....
59
60 What I need here is a small app that will be a constrained,
61 single-purpose watchdog. If a daemon fails, the watchdog attempts 3
62 restarts to get it going, and records the fact it did it (that goes into
63 the big monitoring system as a reportable fact). If the restart fails,
64 then a human needs to attend to it as the problem is serious or beyond the
65 scope of a watchdog.
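
To make that behaviour concrete, here is a rough sketch of one watchdog pass in Python. It is only an illustration of the logic described above, not how monit or supervisord implement it. The names watchdog_check, is_running, restart, record and alert are all made up for this example; the caller would supply them (for instance a pid check, an init-script call, a log line for the monitoring system, and a pager hook).

```python
def watchdog_check(is_running, restart, record, alert, max_attempts=3):
    """Run one watchdog pass; return True if the daemon ends up running.

    All four callables are hypothetical hooks supplied by the caller:
      is_running() -> bool   checks whether the daemon is alive
      restart()              tries to start it again
      record(msg)            logs a fact for the big monitoring system
      alert(msg)             gets a human involved
    """
    if is_running():
        return True  # healthy, nothing to do and nothing to report

    for attempt in range(1, max_attempts + 1):
        restart()
        if is_running():
            # The restart worked: record it, but do not wake anyone up.
            record("restarted after %d attempt(s)" % attempt)
            return True

    # Every attempt failed: this is beyond the scope of a watchdog.
    record("restart failed after %d attempts" % max_attempts)
    alert("restart attempts exhausted, human attention needed")
    return False


if __name__ == "__main__":
    # Simulated daemon that only comes back on the second restart.
    state = {"up": False, "restarts": 0}

    def is_running():
        return state["up"]

    def restart():
        state["restarts"] += 1
        state["up"] = state["restarts"] >= 2

    log = []
    ok = watchdog_check(is_running, restart, log.append, log.append)
    print(ok, log)
```

A real version would presumably wait a little between restart() and the next is_running() check, and would be run from a loop or a timer. The sketch leaves that out and keeps the watchdog constrained to restart, record and escalate.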
66
67 Like you, I'm tired of being woken at 2am because something dropped 1
68 ping when the nightly database maintenance fired up on the vmware
69 cluster :-)
70
71
72 --
73 Alan McKinnon
74 alan.mckinnon@×××××.com
