Eric Brown wrote:
> It's not just init scripts that are designed poorly, it's applications
> that fail to daemonize and don't return non-zero. (like apache-1.3,
> ntp-date, snort, etc.. most servers).
Yeah, that is the real problem, but that is not fixable immediately.

> I think the new baselayout has some adjustable sleep/checks on by
> default, but I haven't tried it yet. I suppose that would work by
> having the init script sleep for a short time, then check to see if a
> certain process is running (a dirty hack that's frowned upon by
> some)...
>
> Anyway, I really think this is something worth looking into, either as
> a separate package that implements some kind of monitoring functions
> in init scripts, or as a basic set of monitoring functions that are
> included in baselayout for developers to optionally use.

> While there are options like daemontools and rmon? I don't think they
> provide a very robust solution to this problem.
Can you elaborate on this? What is wrong with daemontools?

http://cr.yp.to/daemontools/faq/create.html#why

> Here's an idea:
>
> Have users emerge a program that can do some basic things:
emerge daemontools

> 1) check config scripts of running services for variables like:
> CHECK_COMMAND
svstat /service/daemon

> SLEEP_TIME
If I get you right, this is hard coded to 1 second for supervise.

> INIT_SLEEP
If I get you right, this is hard coded to 1 second for supervise.

> IF_DOWN_DO_THIS
I'm not sure how important this is, but flexibility can be achieved inside the ./run script of the
daemon. The normal thing is to start it again.
svc -u /service/daemon

> (those variables could optionally be implemented per package, or
> explicitly disabled, they would have sane defaults in a central .conf
> file for this enhancement package)
This can be worked on; a starting point is sys-process/daemontools-scripts.
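
To make the ./run point above concrete, here is a rough sketch of a helper that creates a service directory whose ./run script carries an IF_DOWN_DO_THIS-style hook. The function name make_service and the daemon path in the usage note are made up for illustration, and the command is pasted into ./run unquoted, so arguments containing spaces would need more care. Not tested against a live svscan tree.

```shell
#!/bin/sh
# make_service DIR COMMAND [ARGS...]
# Hypothetical helper: writes DIR/run so that supervise can start COMMAND.
make_service() {
    dir=$1; shift
    mkdir -p "$dir" || return 1
    # Unquoted heredoc: $* is expanded now, \$(date) is left for run time.
    cat > "$dir/run" <<EOF
#!/bin/sh
# supervise runs this script on every (re)start, so anything placed
# before the exec acts as an "if down, do this" hook.
echo "\$(date): (re)starting $*" >&2
# The daemon must stay in the foreground; supervise tracks this process.
exec $*
EOF
    chmod +x "$dir/run"
}
```

Something like `make_service /var/svc/mydaemon /usr/sbin/mydaemon --foreground` followed by symlinking that directory into /service would hand the daemon over to supervise; both paths are placeholders.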

> 2) is a cron job/daemon that automatically checks all running apps in
> the current runlevel, can report problems (send emails, log stuff,
> etc), can send heartbeat, etc..
Can you trust cron, started from an init script?
Instead, use a script started under supervise, as simple as this:
#!/bin/bash

sleep 60
# Match only the "<dir>: down ..." state, not "normally down" on running services.
svstat /service/* | grep -q ': down ' && \
echo "Something is wrong" | your_favourite_mailer_here

(The above is not tested, just thought of. Feel free to improve.)
(For example, services can be down on purpose; test for that with -e /service/daemon/down.)
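
As one possible improvement along those lines, the sketch below skips services that have a ./down marker file. It assumes svstat reports stopped services as "<dir>: down <n> seconds..."; the function names, the SERVICE_ROOT variable and the use of mail(1) are my own placeholders, not anything daemontools provides.

```shell
#!/bin/sh
# Sketch only. Assumes svstat prints "<dir>: down <n> seconds..." for
# stopped services; adjust the pattern if your version differs.

# Read svstat output on stdin and print every service directory that is
# down WITHOUT a ./down marker file, i.e. not stopped on purpose.
unexpected_down() {
    while IFS= read -r line; do
        case "$line" in
            *": down "*)
                dir=${line%%: down *}
                [ -e "$dir/down" ] || printf '%s\n' "$dir"
                ;;
        esac
    done
}

# One pass of the watcher: mail the list of unexpectedly down services.
# SERVICE_ROOT is a hypothetical knob; swap mail(1) for your own mailer.
watch_once() {
    failed=$(svstat "${SERVICE_ROOT:-/service}"/* 2>/dev/null | unexpected_down)
    [ -z "$failed" ] && return 0
    printf 'Down services:\n%s\n' "$failed" | mail -s "daemontools: service down" root
}
```

The watcher's own ./run script then reduces to a sleep 60 followed by watch_once; supervise restarts it each time it exits, which gives the loop for free.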

> 3) is well documented in the gentoo handbook since it's probably a
> vital component
http://cr.yp.to/daemontools.html is a good start; it can be improved for sure.

> To me this kind of thing is probably simple, unixish, and robust
> enough for most of our needs...
>
> Any thoughts?
Don't top-post :-)

Kalin.
/known also as Korokoro or tar/

P.S. Whose MTA changes the subject with [OBORONA-SPAM]? Please remove that; use a custom header
instead.

--
|[ ~~~~~~~~~~~~~~~~~~~~~~ ]|
+-> http://ThinRope.net/ <-+
|[ ______________________ ]|

--
gentoo-server@g.o mailing list