On Wed 17 August 2011 15:08:09 kashani did opine thusly:
> On 8/17/2011 2:43 PM, Alan McKinnon wrote:
> > I'm just itching to type up the long list of horror stories I've
> > stored from people doing their own DNS thinking it was real
> > easy.
> >
> > But there's this little thing called an NDA and it says I can't
> > :-(
> heh, I think I can dredge one up for you that no one will care about
> these days.
>
> This was at a large ISP in '99 known for their free Internet.
|
I'm glad you detailed that story, now I know I'm not the only one :-)

Long long ago (in the 90s) when a current colleague started working
here, he wanted access to the hidden primary (like your ns00).
|
He was given a bare machine (no OS) with these instructions:

It's 10am, by 4pm I want a name server running on that hardware,
authoritative for domain xxx.yyy.zzz, live on the internet, with
firewall installed and all reasonable security precautions taken. You
do not have to register xxx.yyy.zzz with any registrar, we will test
it with "dig @".

He passed :-)
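
For anyone wondering what a "dig @" test can tell you without a
registrar involved: a server that is authoritative for a zone sets the
"aa" flag in its answer, and dig prints that in its ";; flags:" header
line. Below is a minimal sketch of checking for it, assuming you have
already captured dig's text output. The function name and the sample
text are made up for illustration, not anything from the actual test.

```python
import re


def has_aa_flag(dig_output: str) -> bool:
    """Return True if dig's ';; flags:' header line lists the 'aa' flag."""
    for line in dig_output.splitlines():
        match = re.match(r"^;; flags:([^;]*);", line)
        if match:
            return "aa" in match.group(1).split()
    # No flags line found at all (e.g. the query timed out).
    return False


# Hypothetical header lines in the shape dig prints them.
SAMPLE = """\
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4242
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
"""

print(has_aa_flag(SAMPLE))
```

A caching resolver answering the same question would typically show
"qr rd ra" with no "aa", which is what makes the flag a useful test.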
|
The same fellow 3 years later found one day that the company zone had
not loaded after an update (the name servers are self-hosted in that
zone), and the support person who made the update had done the same
thing twice before recently. Ten minutes later an ACL was in place and
only Systems could edit the zone. The entire company was told to
propose sub-domains for their own teams and Systems would delegate
them - the uproar was fantastic but he stood his ground. He was 100%
right of course and we still benefit today.
|
Lessons learned:
- do not ever mess with your DNS admin
- $DEITY says "sir" in hushed tones when addressing the DNS admin
|
> Bind 8 was fresh on the scene and somehow Network Engineering was
> in charge of DNS rather than Systems. My intern and I came up with a
> plan to have ns00.int as the internal master and make the rest of
> name servers slave off of it. All ns00 did was supply the
> production name servers with zones.
>
> ns00 --> ns01(vip) --> ns01-[01-03]
>    \--> ns02(vip) --> ns02-[01-03]
>     \-> ns03(vip) --> ns03-[01-03]
>
> Three virtual IPs and three name servers behind each vip.
>
> This way we could have tools deal with updating zones on ns00 on the
> internal network and not have to push to a number of name servers.
> This worked well for a few months and we generally forgot about it.
> Almost a month after a reorganization in the local datacenter DNS
> went down. Well not down down, but our zones weren't working. After
> a hectic hour of freaking out, troubleshooting random things, and
> bouncing from machine to machine by IP address because none of DNS
> worked we realized our mistake. The TTL of the zone itself was set
> to three weeks. In the move Bind had silently died on ns00 which we
> didn't monitor because it was inside the corp network. The slaves
> dutifully stayed up and working till they hit the TTL of the zones
> and demanded to speak to the master again. Restarting Bind on the
> prod servers did nothing other than remove the already expired
> cache.
> Once we restarted Bind on ns00 (and made it part of the runlevel)
> the prod servers checked in and all was well.
> |
> The lessons:
> Monitor *all* of your DNS infrastructure
> DNS can break even with a large distributed system and it is never
> pretty.
>
> kashani
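
The "TTL of the zone" that bit you is, I assume, the expire value in
the SOA record: once a slave has gone that long without reaching its
master, it stops answering for the zone. A rough sketch of the check a
monitor could run is below. It assumes you can find out when each
slave last refreshed successfully; how you get that timestamp, the
function name and the warning threshold are all invented here for
illustration.

```python
from datetime import datetime, timedelta


def expiry_status(last_refresh: datetime, expire_seconds: int,
                  now: datetime, warn_fraction: float = 0.5) -> str:
    """Classify a slave zone by how much of its SOA expire time is used up.

    last_refresh   -- when the slave last reached its master (assumed known)
    expire_seconds -- the SOA expire field, in seconds
    warn_fraction  -- hypothetical threshold: warn once this share has elapsed
    """
    age = now - last_refresh
    expire = timedelta(seconds=expire_seconds)
    if age >= expire:
        return "expired"   # the slave no longer serves the zone
    if age >= expire * warn_fraction:
        return "warning"   # still serving, but the master looks dead
    return "ok"


THREE_WEEKS = 21 * 24 * 3600  # the expire value from the story, in seconds
last = datetime(1999, 1, 1)   # made-up date of the last good refresh

print(expiry_status(last, THREE_WEEKS, last + timedelta(days=12)))
```

With a three-week expire, a warning at the halfway mark would have
given about ten days of notice that ns00 had gone quiet.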
-- |
alan dot mckinnon at gmail dot com |