Gentoo Archives: gentoo-server

From: Ramon van Alteren <ramon@××××××××××.nl>
To: gentoo-server@l.g.o, cduffy@×××××××.net
Subject: Re: [gentoo-server] Best practices in managing large server groups
Date: Mon, 21 May 2007 23:15:23
Message-Id: 4652270A.7020906@vanalteren.nl
In Reply to: [gentoo-server] Best practices in managing large server groups by Charles Duffy
1 Hi Charles
2
3 I've been looking for time to answer this more fully than the quick
4 one-shot mail I sent off earlier.
5 We run 600 servers on gentoo and started with a single server originally.
6
7 So yes it's definitely doable :)
8
9 I'll try to answer as much as possible; feel free to mail with any questions.
10
11 Charles Duffy wrote:
12 > I'm looking at replacing SuSE SLES9 with Gentoo for an enterprise
13 > application (for reasons of flexibility and licensing) (no, we don't
14 > have an enterprise application budget -- just the reliability
15 > requirements; yaaay, startups!). We're looking to be able to deploy and
16 > manage hundreds of geographically distributed servers.
17 See above. What is your planned initial deployment? Are you starting
18 with a hundred or more servers, or with just a couple?
19 > We have a QA department available to vet each configuration before it is
20 > deployed to the field. We have infrastructure for tracking the progress
21 > of code in svn from creation through QA to deployment; I'm anticipating
22 > tracking a local overlay (containing all packages we use), make.conf,
23 > /etc/portage/*, etc. through this system, autobuilding system images
24 > (either to run virtualized or on real hardware) from the contents of
25 > svn, building binary packages and deploying them to real hardware.
26 Having a QA department to offload this work to is certainly a bonus :)
27 > I'm interested in best practices, suggested tools, and/or 3rd party
28 > experiences in this regard.
29 >
30 > Some particular questions which come to mind:
31 > - Should I be using a custom profile or a standard profile with
32 > overrides through make.conf, /etc/portage/* and the like?
33 AFAIK you should be able to set all required stuff through overrides.
34 The point is to keep in mind the benefits of using gentoo and not try
35 and work against the system.
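
To make the override approach concrete, here is a minimal sketch of rendering make.conf and /etc/portage/package.use from data kept in version control. The function names, package atoms and flag choices are my own illustrations, not something from our setup; only the file formats (VAR="value" lines and "category/package flag -flag" lines) are standard Portage.

```python
# Hypothetical sketch: render Portage override files from data kept in svn.
# The settings, package atoms and USE flags below are illustrative only.

def render_make_conf(settings):
    """Return make.conf text: one VAR="value" line per setting, sorted by name."""
    return "".join(f'{name}="{value}"\n' for name, value in sorted(settings.items()))


def render_package_use(overrides):
    """Return package.use text: one 'category/package flag -flag ...' line per atom."""
    lines = []
    for atom, flags in sorted(overrides.items()):
        lines.append(f"{atom} {' '.join(flags)}\n")
    return "".join(lines)


make_conf = render_make_conf({
    "USE": "-X -gtk ssl",
    "FEATURES": "buildpkg",  # build a binary package for everything that is merged
})
package_use = render_package_use({
    "dev-lang/php": ["mysql", "-apache2"],
    "www-servers/apache": ["ssl"],
})

print(make_conf, end="")
print(package_use, end="")
```

Keeping the data in one place and generating the files means QA can review a plain diff of the settings before anything reaches a server.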
36 Several people discussed running gentoo servers in production without a
37 build toolchain (gcc etc.).
38 I have no comments to offer on how desirable this is, but if this is a
39 goal/requirement for you I'd strongly suggest using a binary distro.
40 Gentoo shines as an ultra-configurable source-based distro; running it
41 without a build toolchain and/or portage tree is certainly possible.
42 It would, however, take away many of the advantages of using gentoo, so
43 why not switch to something else in that case?
44
45 Removing the portage tree has always been a weird question to me: nobody
46 discusses removing the rpm package database, so why are people so keen on
47 removing the portage tree?
48 It takes roughly 550 MB of space, which is quite a lot, but hardly a
49 prohibitive requirement given today's disk space and hardware.
50 The linux kernel source tree is in roughly the same size category.
51 > - What's the Right Way to create new system images ready to be loaded
52 > onto a hard drive or run through a virtual machine? gentoo-buildhoster
53 > looks interesting. I've seen Catalyst mentioned as a way to create
54 > stage3 images, but what documentation I've been able to find doesn't
55 > seem very much targeted for my use cases.
56 I would recommend catalyst-2. Although documentation is lacking, it
57 isn't that hard to set up.
58 You're probably looking for the stage4 target if you want to build
59 system images.
60 When rolling out gentoo on such a large scale, you need a repeatable
61 system image build environment.
62
63 The bonus of catalyst is that you automatically get a binary package
64 server in the process of generating your images.
65 Catalyst can be told to use a binary package cache; by carefully setting
66 up your catalyst environment you can easily reuse that as the source for
67 your binpkg server.
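
As a rough illustration of reusing that cache, the sketch below exposes a package directory over HTTP with nothing but the Python standard library. This is not how we serve packages; the function name is made up, and which directory your catalyst setup writes its package cache to is something you have to fill in yourself.

```python
import functools
import http.server


def make_binhost_server(package_dir, host="127.0.0.1", port=0):
    """Build an HTTP server that serves the files under package_dir.

    port=0 lets the OS pick a free port, which is handy for testing;
    a real binhost would listen on a fixed, well-known port.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=package_dir
    )
    return http.server.ThreadingHTTPServer((host, port), handler)
```

Calling serve_forever() on the returned server starts answering requests; clients would then point PORTAGE_BINHOST in make.conf at its URL. For anything beyond a test setup, a proper web server in front of the same directory is the more sensible choice.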
68 I'm not familiar with gentoo-buildhoster, but since its webpage [1]
69 lists it as no longer maintained, that would be a no-go area for me.
70
71 We combined catalyst with a pxe-based boot environment, the quickstart
72 installer [2] and puppet [3].
73 It allows us to provision a server within 30 minutes. That's 30 minutes
74 from connecting the hardware to a switch to being active in production.
75 This requires very little manual intervention, which we consider to be a
76 good thing (tm).
77 And yes, that works concurrently: we believe it can set up roughly 30
78 servers at once, and that ceiling appears to be a pxe limitation.
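
To show what that ceiling means for planning a rollout, here is a small sketch that splits a host list into waves of at most 30. The hostnames and the function name are hypothetical, and the time estimate assumes waves run strictly one after another at the full 30 minutes each, which is a simplification.

```python
def provisioning_waves(hosts, max_concurrent=30):
    """Split hosts into consecutive batches of at most max_concurrent."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    return [hosts[i:i + max_concurrent] for i in range(0, len(hosts), max_concurrent)]


# 100 hypothetical hostnames: node001 .. node100
hosts = [f"node{n:03d}" for n in range(1, 101)]
waves = provisioning_waves(hosts)

minutes_per_wave = 30  # the per-server figure quoted above
print(len(waves), "waves,", len(waves) * minutes_per_wave, "minutes if run back to back")
# prints: 4 waves, 120 minutes if run back to back
```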
79
80 If at all possible, try to build your deployment system so that you
81 can always easily wipe a server and reinstall it.
82 We didn't originally and are refactoring to allow it now.
83 > - Any experiences with puppet? With our ratio of servers to staff,
84 > automating configuration and administration is a priority. (We already
85 > have an internal tool written with automating the server configuration
86 > process in mind; it has some functionality puppet doesn't, and puppet
87 > has functionality it doesn't; in theory, I'd like to extend puppet until
88 > our internal tool becomes unnecessary, though I'll need to understand
89 > puppet much better before I can think too hard about that).
90 We are using puppet extensively. It works, although it's still rough
91 around the edges, which is to be expected from such a young project.
92 The gentoo provider for puppet is in its infancy. It works, but
93 definitely needs work as well.
94
95 Apart from that puppet is a very versatile and powerful tool.
96 And most importantly it has a very active community of people around it.
97 They are actively exchanging recipes for server configuration, which is
98 useful in itself but becomes extremely useful when combined with
99 the new module organization in puppet.
100
101 Many problems you will face in deploying and configuring that many
102 servers will have been solved in whole or in part by someone in the
103 puppet community.
104
105 I'd be curious to know what functionality is missing from puppet right
106 now, in your opinion.
107 > - Have any of 'yall been in the 100s-of-servers situation with
108 > comparable requirements and come up with a different approach to
109 > managing it? How did things work out?
110 We've grown very, very fast and have tried different methods along the way.
111 To be honest, we're still in the process of moving most of the
112 server park under puppet control (nearly 50% done).
113 And I actually do not expect to find a single set of tools that copes
114 with all the issues that you face when deploying that many servers.
115 Your situation might be different because you are starting with a single
116 app.
117
118 One vital thing that is missing from the picture is an inventory database.
119 You need some sort of queryable database that stores servers, location,
120 networking info, function, a server-identifying serial of some sort,
121 hardware classes, deployment status etc.
122 Without that you're basically lost. We use a homebrew mysql-based system
123 with both a CLI and a web interface.
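
To give an idea of the shape of such a database, here is a minimal sketch using sqlite3 from the Python standard library instead of mysql, purely so it runs without a server. The table layout, column names and sample hosts are assumptions for illustration and not our actual schema.

```python
import sqlite3

# Hypothetical schema covering the fields listed above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS servers (
    serial         TEXT PRIMARY KEY,      -- hardware serial or other unique id
    hostname       TEXT NOT NULL UNIQUE,
    location       TEXT NOT NULL,         -- datacenter / rack
    ip_address     TEXT NOT NULL,
    function       TEXT NOT NULL,         -- e.g. web, db, log
    hardware_class TEXT NOT NULL,
    status         TEXT NOT NULL          -- e.g. racked, installing, production
)
"""


def open_inventory(path=":memory:"):
    """Open (and if needed create) the inventory database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_server(conn, serial, hostname, location, ip_address,
               function, hardware_class, status):
    """Insert one server; the 'with' block commits or rolls back."""
    with conn:
        conn.execute(
            "INSERT INTO servers VALUES (?, ?, ?, ?, ?, ?, ?)",
            (serial, hostname, location, ip_address,
             function, hardware_class, status),
        )


def hosts_by(conn, function, status):
    """Return hostnames with the given function and deployment status."""
    rows = conn.execute(
        "SELECT hostname FROM servers"
        " WHERE function = ? AND status = ? ORDER BY hostname",
        (function, status),
    )
    return [row[0] for row in rows]


conn = open_inventory()
add_server(conn, "SN0001", "web01", "dc1/rack3", "10.0.0.11", "web", "1u-dual", "production")
add_server(conn, "SN0002", "web02", "dc1/rack3", "10.0.0.12", "web", "1u-dual", "installing")
add_server(conn, "SN0003", "db01", "dc1/rack4", "10.0.0.21", "db", "2u-raid", "production")

print(hosts_by(conn, "web", "production"))
# prints: ['web01']
```

The point is less the storage engine than having one place that every script, installer and monitoring check can query.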
124
125 Apart from that, you'll need lots of infrastructure:
126 * logservers
127 * monitoring
128 * statistics gathering
129 * backups
130 * scripts repository
131 * version controlled configs
132 * bug / issue tracking
133 * firewalling
134 * loadbalancing
135 * network monitoring and configuration system
136 * etc. etc. etc.
137 > Thank you!
138 You're welcome. I would like to see more people use gentoo in
139 large-scale environments and am actively looking for opportunities to
140 exchange experiences.
141 Feel free to contact me if you have any questions. I also hang out in a
142 couple of irc channels [4] under the nickname Innocenti.
143
144 Out of pure curiosity, what is your staff-to-server ratio?
145
146 Grtz Ramon
147
148 Senior System Administrator Hyves.nl
149
150 [1]http://badpenguins.com/source/
151 [2]http://dev.gentoo.org/~agaffney/quickstart.php
152 [3]http://puppet.reductivelabs.com/
153 [4]#puppet, #gentoo-server, #gentoo-cluster, #gentoo-amd64
154 --
155 gentoo-server@g.o mailing list
