Gentoo Archives: gentoo-user

From: kashani <kashani-list@××××××××.net>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Running HTTP and DNS on same machine
Date: Wed, 17 Aug 2011 22:09:22
Message-Id: 4E4C3BC9.7060105@badapple.net
In Reply to: Re: [gentoo-user] Running HTTP and DNS on same machine by Alan McKinnon
On 8/17/2011 2:43 PM, Alan McKinnon wrote:
>
> I'm just itching to type up the long list of horror stories I've
> stored from people doing their own DNS thinking it was real easy.
>
> But there's this little thing called an NDA and it says I can't :-(

heh, I think I can dredge one up for you that no one will care about
these days.

This was at a large ISP in '99 known for their free Internet. Bind 8
was fresh on the scene and somehow Network Engineering was in charge of
DNS rather than Systems. My intern and I came up with a plan to have
ns00.int as the internal master and make the rest of the name servers
slave off of it. All ns00 did was supply the production name servers
with zones.

ns00 --> ns01(vip) --> ns01-[01-03]
     \--> ns02(vip) --> ns02-[01-03]
      \-> ns03(vip) --> ns03-[01-03]

Three virtual IPs and three name servers behind each vip.

This way we could have tools deal with updating zones on ns00 on the
internal network and not have to push to a number of name servers. This
worked well for a few months and we generally forgot about it. Almost a
month after a reorganization in the local datacenter, DNS went down. Well,
not down down, but our zones weren't working. After a hectic hour of
freaking out, troubleshooting random things, and bouncing from machine
to machine by IP address because none of our DNS worked, we realized our
mistake. The expire time of the zone itself (the expire field in the SOA
record) was set to three weeks. In the move, Bind had silently died on
ns00, which we didn't monitor because it was inside the corp network.
The slaves dutifully stayed up and working till they hit the expire time
of the zones and demanded to speak to the master again. Restarting Bind
on the prod servers did nothing other than remove the already expired
cache.
Once we restarted Bind on ns00 (and made it part of the runlevel) the prod
servers checked in and all was well.
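
The timing in this story comes from the expire field of the zone's SOA
record: a slave keeps answering for that long after its last successful
contact with the master, then stops serving the zone. A minimal Python
sketch of that arithmetic follows. The function names and the dates are
made up for illustration; only the three-week value comes from the story.

```python
from datetime import datetime, timedelta

# Three weeks in seconds, the expire value from the story.
THREE_WEEKS = 21 * 24 * 3600


def slave_expiry_deadline(last_refresh: datetime, soa_expire_seconds: int) -> datetime:
    """Moment a slave stops answering for the zone if it never reaches
    the master again after last_refresh."""
    return last_refresh + timedelta(seconds=soa_expire_seconds)


def seconds_until_expiry(last_refresh: datetime, soa_expire_seconds: int,
                         now: datetime) -> int:
    """Seconds left before the slave gives up on the zone (0 if already expired)."""
    remaining = slave_expiry_deadline(last_refresh, soa_expire_seconds) - now
    return max(0, int(remaining.total_seconds()))


# Hypothetical date of the last successful refresh from ns00.
last_ok = datetime(1999, 6, 1, 12, 0, 0)
deadline = slave_expiry_deadline(last_ok, THREE_WEEKS)
print("slaves stop answering at:", deadline)
```

Alerting when seconds_until_expiry drops below some threshold (a few days,
say) would have turned the outage into a warning.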

The lessons:
Monitor *all* of your DNS infrastructure
DNS can break even with a large distributed system and it is never pretty.
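
As a rough illustration of the first lesson, here is a Python sketch of a
check that treats the internal master like any other name server. It is
not what we ran at the time. It assumes some other tool has already
collected each server's SOA serial (None meaning no answer); the function
name, the host names in the example, and the serial values are
hypothetical.

```python
from typing import Dict, List, Optional


def find_dns_problems(master: str, serials: Dict[str, Optional[int]]) -> List[str]:
    """Return human-readable problems given each server's SOA serial.

    serials maps server name -> serial, or None if the server did not answer.
    The master is checked too, even though nothing public queries it.
    """
    problems = []
    master_serial = serials.get(master)
    if master_serial is None:
        problems.append(f"{master}: master not answering")
    for server, serial in sorted(serials.items()):
        if server == master:
            continue
        if serial is None:
            problems.append(f"{server}: not answering")
        elif master_serial is not None and serial != master_serial:
            problems.append(
                f"{server}: serial {serial} differs from master {master_serial}"
            )
    return problems


# Hypothetical snapshot: the slaves look healthy, the hidden master is dead.
snapshot = {"ns00": None, "ns01-01": 1999060101, "ns01-02": 1999060101}
for line in find_dns_problems("ns00", snapshot):
    print(line)
```

With a snapshot like the one above the slaves all agree with each other,
so only a check that includes ns00 itself reports anything.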

kashani
