Eric Brown wrote:

> Services that use Gentoo init scripts often report a status of [started] or
> [OK] even though they fail to start. The most recent bug like this that I've
> found is with snort. If you have a bad rule, snort will initialize, the
> rc-scripts will give it an [OK] status, and then it will die once it parses the
> rules.
>
> The real problem is not that the daemons don't return errors, but that our init
> scripts do not make reasonable attempts to verify service startup. If a Gentoo
> init script claims that a service started, it should make an effort to check
> that the processes are actually running shortly after the script is run, even if
> start-stop-daemon says the parent process initialized. Relying on the return
> value of start-stop-daemon is simply insufficient for some services.
>
> I am aware that there are services that can monitor the status of other services
> (app-admin/mon?) but I think this issue is a little different. If an ebuild
> developer is aware of an error condition that can commonly occur shortly after a
> daemon initializes, why not attempt to catch those errors? Most of them could
> probably be caught by simply checking to see if the process is still running
> shortly after the script is run.
>
> I propose increasing developer awareness of this problem, perhaps through some
> formal guidelines for ebuild developers. At the very least, I would like to see
> these bugs being acknowledged in bugs.gentoo.org instead of getting the same old
> upstream/it's not our fault response. We are responsible for our init scripts,
> and they are important to our users.
>
> I have 2 ideas for the actual implementation:
>
> 1) Some kind of check() function in the init.d script, or a generic check() function
> that just checks with ps | grep. This might typically be called after having the
> init script sleep for a certain amount of time.
>
> 2) Some kind of special init script that checks registered daemons after all services
> have started (i.e. it depends on all daemons, or they are put into its config file).
> With this scheme we could avoid excessive sleeping during startup (to keep it fast),
> and perhaps even keep using service-specific check() functions.
>
> Does anyone else think this idea is worth looking into?

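For what it's worth, idea 1 from the quoted message could be sketched roughly as below. This is only an illustration, not anything that exists in our rc-scripts: `check_alive` is a hypothetical name, and it tests a known PID with `kill -0` instead of the `ps | grep` approach mentioned above, on the assumption that the init script can get the PID (for example from a pidfile).

```shell
#!/bin/sh
# Hypothetical helper, not part of Gentoo's rc-scripts.
# Usage: check_alive PID [DELAY]
# Sleeps DELAY seconds (default 2), then succeeds only if PID still exists.
check_alive() {
    check_pid=$1
    check_delay=${2:-2}

    # Give the daemon a moment to parse its config and possibly die.
    sleep "$check_delay"

    # Signal 0 delivers nothing; kill only reports whether the process
    # exists and whether we are allowed to signal it.
    kill -0 "$check_pid" 2>/dev/null
}
```

An init script could call something like `check_alive "$(cat /path/to/pidfile)" 3` after start-stop-daemon returns and report a failure if it returns non-zero. The delay value is a guess and would have to be tuned per service.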
http://bugs.gentoo.org/show_bug.cgi?id=90471

We handled this by checking for the socket that MySQL always creates on *nix,
but with a timeout of five seconds: if there is neither an error message nor a
socket within that time, the script assumes the server started.
I'm the first to say that this needs to be improved, but it's a start.

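To make the socket check concrete, here is a minimal sketch of the polling part. It is not the code from the bug above: `wait_for_socket` is a hypothetical name, the one-second polling interval is an assumption, and unlike the behaviour I described it returns failure on timeout and leaves it to the caller to decide what to assume.

```shell
#!/bin/sh
# Hypothetical helper, not the actual mysql init script code.
# Usage: wait_for_socket PATH [TIMEOUT]
# Polls once per second until PATH exists as a socket or TIMEOUT
# seconds (default 5) have passed.
# Returns 0 if the socket showed up, non-zero otherwise.
wait_for_socket() {
    sock_path=$1
    sock_timeout=${2:-5}
    sock_waited=0

    while [ "$sock_waited" -lt "$sock_timeout" ]; do
        # -S is true only if the path exists and is a socket.
        [ -S "$sock_path" ] && return 0
        sleep 1
        sock_waited=$((sock_waited + 1))
    done

    # One last look after the final sleep.
    [ -S "$sock_path" ]
}
```

A start() function might run this right after launching the server, passing whatever socket path the server is configured with, and treat a non-zero return as "could not confirm startup" rather than reporting [OK] unconditionally.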
--
gentoo-dev@g.o mailing list