On 18/02/2014 14:16, J. Roeleveld wrote:
> On Tue, February 18, 2014 12:17, Alan McKinnon wrote:
>> On 18/02/2014 11:52, J. Roeleveld wrote:
>>> On Tue, February 18, 2014 10:47, Alan McKinnon wrote:
>>>> What I do run into is daemons that drop privs on start up, like
>>>> tac_plus. Unwary new sysadmins always try to start/stop it as root,
>>>> causing an unholy mess. Root then owns the log and pid files; when
>>>> tac_plus drops privs it can't record its state, so it continues to
>>>> service requests but fails to log any of them. For an auth daemon,
>>>> that's a serious issue.
>>>
>>> Shouldn't sysadmins use the init-scripts for that?
>>> If done correctly, permissions should not be an issue.
>>
>> It's a little more complex than just that. It's an auth service and users
>> are frequently added, removed and modified. The daemon does syntax
>> checking on its config file at startup or after being HUP'ed but that
>> only finds static errors. It catches things like adding people to a grop
>> instead of to a group, but misses dynamic mistakes like adding users to
>> groups that don't exist.
>
> The auth-service gets the current state from a static file that is only
> read upon service-start?

Yes.

It's a good design for reasonably static userbases. The user details,
privilege definitions, password hashes and such are stored in a single
flat file readable only by root and protected by file permissions.
Overall protection is provided by restricted shell access to the host.

We're not talking about AT&T's RADIUS servers for DSL users here who
sign up on a web form - for that you would use a database backend - this
is for the company's network support personnel who log into the backbone
and configure the network itself. There's no rush to add new (and
unproven...) users so this scheme suits me just fine. Yes, it has quirks
but these no longer bother me; we get caught out by new sysadmins
who have not felt that pain yet.

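To make the static vs dynamic distinction concrete, here is a rough
sketch of the kind of extra check a wrapper script could run before
HUP'ing the daemon. It is not how tac_plus itself validates anything.
It assumes a simplified config syntax of "group = NAME {" and
"user = NAME {" blocks containing "member = GROUP" lines, and the
function name is made up for illustration.

```python
import re

# Assumption: blocks open with "user = NAME {" or "group = NAME {" on one line.
BLOCK_RE = re.compile(r"^\s*(user|group)\s*=\s*([^\s{}]+)\s*\{")
# Assumption: membership is declared as "member = GROUP" inside a block.
MEMBER_RE = re.compile(r"\bmember\s*=\s*([^\s{}]+)")


def find_undefined_groups(config_text):
    """Return (kind, name, group) tuples whose group is never defined."""
    defined = set()      # every group name that has its own block
    memberships = []     # every "member = ..." seen, with its owning block
    current = None       # (kind, name) of the block we are currently inside

    for line in config_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip whole-line comments
        block = BLOCK_RE.match(line)
        if block:
            current = (block.group(1), block.group(2))
            if current[0] == "group":
                defined.add(current[1])
        member = MEMBER_RE.search(line)
        if member and current:
            memberships.append((current[0], current[1], member.group(1)))

    # Filter only after the full pass, so a group defined later in the
    # file than its first use is still treated as valid.
    return [m for m in memberships if m[2] not in defined]


if __name__ == "__main__":
    sample = """
    group = netops {
        default service = permit
    }
    user = alice {
        member = netops
    }
    user = bob {
        member = netopz
    }
    """
    for kind, name, group in find_undefined_groups(sample):
        print(f"{kind} {name} refers to undefined group {group}")
```

A wrapper would refuse to reload the daemon whenever the returned list
is non-empty. The daemon's own syntax check stays in place; this only
covers the one class of mistake it misses.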
>
>> It's exactly analogous to compile-time vs runtime errors; compilers
>> can't catch the latter.
>>
>> Despite this all being run out of cron with wrapper scripts to check
>> validity, automated additions and safety checks between all three
>> daemons, plus being fully documented on the internal wiki and in bold
>> blinking red caps in the login motd, people still find ways to do
>> stuff in an attempt to fix it.
>
> (OT: Does the bold blinking red caps work on all terminals? :) )

Um, OK, you got me there. I was exaggerating!

>
>> The daemon also tries to log these errors, by writing to a log file it
>> has no write permissions on.
>
> "setuid" on the group with group-write in the umask not an option?

Hmmm, that's worth investigating. I hadn't really considered that as I
have an aversion to trying to use umask as a control for anything.

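If I read the suggestion as putting the setgid bit on the log directory
and relaxing the umask, a minimal sketch might look like the following.
It is untested against tac_plus. The helper names, the directory layout
and the file name are assumptions for illustration only.

```python
import os
import stat
import tempfile


def make_shared_log_dir(path, gid=None):
    """Create `path` as a setgid, group-writable directory.

    With the setgid bit set on a directory, files created inside it
    inherit the directory's group rather than the creator's primary group.
    """
    os.makedirs(path, exist_ok=True)
    if gid is not None:
        os.chown(path, -1, gid)  # changing the group needs enough privilege
    os.chmod(path, 0o2775)       # rwxrwsr-x


def create_log_file(path):
    """Create the log file (if missing) so that it is group-writable."""
    old_umask = os.umask(0o002)  # do not mask out the group-write bit
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o664)
        os.close(fd)
    finally:
        os.umask(old_umask)      # always restore the previous umask


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log_dir = os.path.join(tmp, "log")
        make_shared_log_dir(log_dir)
        log_file = os.path.join(log_dir, "tac_plus.log")  # hypothetical name
        create_log_file(log_file)
        info = os.stat(log_file)
        print(stat.filemode(info.st_mode),
              info.st_gid == os.stat(log_dir).st_gid)
```

The catch is the one that makes me wary of umask in the first place: the
mode only applies when the file is created. A log or pid file that root
already created with a tighter mode stays that way until someone fixes
it by hand.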
>
>> There is nothing I can do about the quality of sysadmins; I have no
>> input into the HR process and damagement think cheaper is always better,
>> including skills. What I can do is find ways to make the software more
>> resistant to errors than it already is.
>
> And only grant access permissions to these rookies once they have proven
> they understand rule #1: If In Doubt, Call Someone Who Knows!

Hah! I fought that good fight for years and fought it well. They don't
call me the sysadmin from hell around here without good reason. And I
did manage to get a cowboy network under control and instill respect for
how much breakage Cisco's products can cause.

It's getting harder to grant access based purely on expertise,
especially when someone crunched the numbers. It turns out that the cost
of fixing mistakes is far less than the cost of leaving new untrained
people unutilized and letting support tickets pile up...

>
> But yes, I fully understand the methods of HR and Damagement.
> It is a financial mistake and risk not to include technical expertise
> checks in the recruitment phase for technical positions.

Interesting story:

I once had a good shouting match with a support manager about the
quality of his recruits. I demanded to know why he hired so many
clueless idiots (my exact words). This manager knows me well so he just
smiled and said "Alan, you didn't get to see the applicants we rejected.
These are the best in the market who applied".

*That* was a wake-up call of note :-)

>
> How much does it cost the company each time this goes wrong and someone
> like you has to come online to fix the issue?
> That is what Damagement needs to understand.

Surprisingly, it's not too expensive. There's always one of us on duty
or standby and outages don't continue unnoticed for long. Longest that I
recall is 3 minutes, then the phone starts ringing non-stop. Remember,
this system is internal; it does not service customers.

-- 
Alan McKinnon
alan.mckinnon@×××××.com