Gentoo Archives: gentoo-dev

From: Chris Gianelloni <wolf31o2@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] init script guidelines
Date: Tue, 19 Jul 2005 17:24:54
Message-Id: 1121793750.26224.8.camel@cgianelloni.nuvox.net
In Reply to: [gentoo-dev] init script guidelines by Eric Brown
1 On Tue, 2005-07-19 at 12:42 -0400, Eric Brown wrote:
2 > Services that use Gentoo init scripts often report a status of [started] or
3 > [OK] even though they fail to start. The most recent bug like this that I've
4 > found is with snort. If you have a bad rule, snort will initialize, the
5 > rc-scripts will give it an [OK] status, and then it will die once it parses the
6 > rules.
7
8 So snort shouldn't be giving the OK until it really is OK.
9 >
10 > The real problem is not that the daemons don't return errors, but that our init
11 > scripts do not make reasonable attempts to verify service startup. If a Gentoo
12 > init script claims that a service started, it should make an effort to check
13 > that the processes are actually running shortly after the script is run, even if
14 > start-stop-daemon says the parent process initialized. Relying on the return
15 > value of start-stop-daemon is simply insufficient for some services.
16
17 Not really. An init script is simply a script. It doesn't guarantee
18 anything other than what the service told it. If a service is returning
19 status codes when it really hasn't completed its initialization, that is
20 a bug in that service, not in the init script code. While code might
21 need to be adjusted in the init script, this will most likely require
22 patches to the upstream sources.
23 >
24 > I am aware that there are services that can monitor the status of other services
25 > (app-admin/mon?) but I think this issue is a little different. If an ebuild
26 > developer is aware of an error condition that can commonly occur shortly after a
27 > daemon initializes, why not attempt to catch those errors? Most of them could
28 > probably be caught by simply checking to see if the process is still running
29 > shortly after the script is run.
30
31 I agree with you that we should catch the errors, but running another
32 check is simply a waste of time. The service should not ever show a
33 completed state until it is completed. It shouldn't ever be like "Yes,
34 snort worked.......... oh wait, no it didn't." That is even more
35 confusing for users.
36
37 > I propose increasing developer awareness of this problem, perhaps through some
38 > formal guidelines for ebuild developers. At the very least, I would like to see
39 > these bugs being acknowledged in bugs.gentoo.org instead of getting the same old
40 > upstream/it's not our fault response. We are responsible for our init scripts,
41 > and they are important to our users.
42
43 You really need to take this up with the developers in question, as this
44 is not a global matter, but really a matter with specific packages.
45 Those are bugs in those packages. If the ebuild maintainers are
46 refusing to resolve issues in the init scripts, which are definitely
47 Gentoo works, please take it up with user relations or attempt to
48 provide a fix for the problem.
49 >
50 > I have 2 ideas for the actual implementation:
51 >
52 > 1) Some kind of check() function in the init.d script, or a generic check() function
53 > that just checks with ps | grep. This might typically be called after having the
54 > init script sleep for a certain amount of time.
55
56 I would object to this. Having a function to check the status of a
57 service for all of the possible services, when it is only a few that are
58 showing this error, is a bad idea. It adds extra load on all developers
59 that have any init scripts, and is unnecessary in most cases.
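
For reference, the delayed check proposed in idea 1 (start the service, sleep
briefly, then see whether the process is still there) could be sketched
roughly as below. This is a minimal illustration in Python, not an actual
Gentoo init script and not anything start-stop-daemon does. The function name
start_and_verify and the grace parameter are hypothetical, and the sketch
assumes the service stays in the foreground. A daemon that forks and lets its
parent exit with 0 would need a pidfile-based check instead.

```python
import subprocess
import time


def start_and_verify(cmd, grace=1.0):
    """Start cmd, wait `grace` seconds, then check whether it is still running.

    Returns (proc, returncode). returncode is None if the process is still
    alive after the grace period, otherwise it is the exit status it died with.
    Both the function name and the `grace` parameter are hypothetical.
    """
    proc = subprocess.Popen(cmd)
    time.sleep(grace)      # give the service time to parse its config and fail
    rc = proc.poll()       # poll() returns None while the process is running
    return proc, rc


def report(name, rc):
    """Format a status line in the style of the [OK] / [!!] markers."""
    if rc is None:
        return f"{name}: [ ok ]"
    return f"{name}: [ !! ] exited with status {rc} shortly after start"
```

The cost Chris points out is visible here: every service started this way
pays the full grace delay, whether or not it is one of the few that fail late.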
60 >
61 > 2) Some kind of special init script that checks registered daemons after all services
62 > have started. (i.e. it depends on all daemons, or they are put into its config file).
63 > With this scheme we could avoid excessive sleeping during startup (to keep it fast),
64 > and perhaps even keep using service-specific check() functions.
65
66 This would require much more knowledge on the end-user's part. Plus, it
67 will need to be aware of init script dependencies. All in all, it
68 sounds like a bad patch for the situation.
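
To make idea 2 concrete, a checker that runs once after all services have
started might look something like the sketch below. It is only an
illustration in Python under stated assumptions: the registry (a mapping of
service name to pidfile path) stands in for the "config file" mentioned above
and is hypothetical, as are the function names. It assumes a POSIX system and
that each registered daemon writes a pidfile. It does not handle init script
dependencies, which is one of the objections raised here.

```python
import os


def pid_alive(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only performs the existence/permission check
    except ProcessLookupError:
        return False     # no such process
    except PermissionError:
        return True      # process exists but belongs to another user
    return True


def failed_services(registry):
    """Return names of services whose pidfile is missing, invalid, or stale.

    `registry` is a hypothetical mapping of service name -> pidfile path.
    """
    failed = []
    for name, pidfile in registry.items():
        try:
            with open(pidfile) as fh:
                pid = int(fh.read().strip())
        except (OSError, ValueError):
            failed.append(name)  # pidfile missing or unreadable
            continue
        if pid <= 0 or not pid_alive(pid):
            failed.append(name)  # pidfile points at a dead process
    return failed
```

Because the check runs once at the end, no individual init script has to
sleep, but the user has to keep the registry in sync with the services that
are actually enabled.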
69
70 --
71 Chris Gianelloni
72 Release Engineering - Strategic Lead/QA Manager
73 Games - Developer
74 Gentoo Linux

Attachments

File name       MIME type
signature.asc   application/pgp-signature