Gentoo Archives: gentoo-user

From: Joshua Murphy <poisonbl@×××××.com>
To: gentoo-user@l.g.o
Subject: Re: Gentoo for many servers (was: Re: [gentoo-user] executing commands on lots of servers at once)
Date: Sun, 15 Nov 2009 08:01:08
Message-Id: c30988c30911142220t5608232cp19fc799819d62b8f@mail.gmail.com
In Reply to: Re: Gentoo for many servers (was: Re: [gentoo-user] executing commands on lots of servers at once) by Alex Schuster
1 On Sat, Nov 14, 2009 at 5:09 PM, Alex Schuster <wonko@×××××××××.org> wrote:
2 > Alan McKinnon writes:
3 >
4 >> On Saturday 14 November 2009 19:36:06 Alex Schuster wrote:
5 >>> Alan McKinnon wrote:
6 >
7 >>>> clusterssh will let you log into many machines at once and run emerge
8 >>>> -avuND world everywhere
9 >>> This is way cool. I just started using it on eight Fedora servers I am
10 >>> administrating. Nice, now this is an improvement over my 'for h in
11 >>> $HOSTS; do ssh $h "yum install foo"; done' approach.
12 >>
13 >> I feel your pain :-)
14 >>
15 >> We used to have the same problem adding new admins to 87 machines. Now
16 >> we have a bespoke provisioner that does it all.
17 >
18 > Sorry, I just do not get 'bespoke provisioner'. Some sort of software,
19 > like clusterssh? Or a person, one admin instead of many?
20 >
21 >
22 >>> What do you guys think about using Gentoo for servers? At the institute
23 >>> I partially work we chose Fedora. There is no special reason for that -
24 >>> we already had some Fedora machines, the setup seemed to work, the
25 >>> reputation was good, so we kept it. That was okay for me, why choose
26 >>> many different environments and learn everything again. I mentioned
27 >>> Gentoo, but did not really suggest to actually use it. Maybe I should
28 >>> have.
29 >>
30 >> I'm a huge fan of Gentoo
31 >
32 > Now who would have thought of that!
33 >
34 >> and all my personal machines (except the new netbook) have run it for the
35 >> last 5 years.
36 >>
37 >> But I will never install Gentoo on a production server at work.
38 >>
39 >> Why?
40 >>
41 >> Because it is too time consuming, because no two machines are set up the
42 >> same, because I can't trust that other admins used the flags they should
43 >> have. So updates become a case of logging into 80+ machines individually
44 >> and doing emerge world by hand. Gentoo allows you to customize things to
45 >> the nth degree - that is its strength - so people WILL use this one
46 >> discriminating factor.
47 >>
48 >> If OTOH I had a server farm of 80+ machines, all identical, I'd put
49 >> Gentoo on them in a flash. But I don't have that
50 >
51 > Of our 8 machines, 7 are essentially the same and differ only in hard
52 > drive space and CPU speed. The other machine is Intel, not AMD, and needs
53 > different IDE drivers. At the moment it has a different initrd (I set up a
54 > minimal fedora install to generate it after the cloned system did not
55 > boot), the rest is - apart from some config files - identical.
56 >
57 > So I would make sure that about everything is exactly the same, well,
58 > maybe except for hostnames, udev net-persistent-rules, ssh keys... what
59 > more?
60 > The last, a little different machine is a problem though. With optimized
61 > CFLAGS, this one would have to compile all stuff again, while for the
62 > others I could use binpkgs. Updating them all with clusterssh should not
63 > be much more work than updating a single one. Well, not completely true, I
64 > would have the double work, as I would upgrade one server first to test if
65 > there are problems, and then do it for the others. Maybe I could use the
66 > special machine to test stuff, and then update all the others.
67 >
68 > If they would differ, Gentoo would of course be too much work. I already
69 > have this problem now... there is my desktop machine, my notebook running
70 > a Gentoo VM, a second desktop machine at my other home, the living-room
71 > machine of my flat share, the machine of a friend I also administrate, the
72 > server of my flat share I need to set up again... and clusterssh is no
73 > option here.
74
75 My potentially ill-informed thoughts on the above issues/ideas:
76
77 1) Pick one machine to host your make.conf, your portage tree, and
78 your distfiles, potentially splitting them into separate NFS mounts
79 shared out to the rest of the hosts (keeping the portage tree itself
80 read-only on all but its owning machine forces centralization of
81 syncing).
82
83 2) /etc/make.conf should simply be a symlink to the centrally located
84 copy. If you must use binpackages, set -march to something that will
85 run on every machine involved, then set -mtune (-mcpu on x86 is a
86 deprecated synonym) to whichever CPU is most common if you want a bit
87 more performance here or there. If you don't mind compiling on every
88 host, though, set PORTAGE_NICENESS to something friendly to your users
89 and -march to native (a BAD idea with distcc; use binpackages there).
90
91 3) Use a replaceable system for your testing (otherwise identical to
92 the others, and therefore able to be brought back online by just
93 cloning it over), and keep frequent scheduled backups of whichever
94 system plays host to your portage tree, binpackages, and distfiles.
95
96 4) Build your kernel with built-in drivers for every piece of
97 boot-time essential hardware in your systems. You'll still have a far
98 cleaner setup than a mass-produced, distro-provided kernel, you'll only
99 need to maintain one kernel for all your systems, and you'll only have
100 one kernel to build against if you need any out-of-kernel modules as
101 well.
102
103 5) Script the changing of ssh host keys (or even redistribution of
104 them, if you ), removal of persistent net rules, and prompting for the
105 host name, and you'll have a nice, tiny postinstall tool for the rare
106 case in which you need to re-deploy a system. When re-deploying, you
107 may instead wish to restore things like ssh host keys from backups,
108 since changing them means adjusting known hosts lists everywhere
109 else.
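
For what it's worth, points 1 and 2 might come out as a shared make.conf along these lines. This is only a sketch: the -march/-mtune values, the niceness level, and the idea that the paths below are NFS mounts from one central host are my assumptions, not settings anyone in the thread has tested on these machines.

```shell
# Sketch of a shared /etc/make.conf, symlinked from the central copy.
# Every value here is an illustrative assumption, not a recommendation.

# Baseline that runs on every host, tuned for the most common CPU.
# x86-64 and k8 are placeholders for a mostly-AMD farm with one Intel box.
CFLAGS="-O2 -pipe -march=x86-64 -mtune=k8"
CXXFLAGS="${CFLAGS}"

# Be friendly to users whose long jobs run while emerge compiles.
PORTAGE_NICENESS="15"

# Tree, distfiles, and binary packages; assumed to be NFS mounts
# exported by the one machine that does the syncing.
PORTDIR="/usr/portage"
DISTDIR="${PORTDIR}/distfiles"
PKGDIR="${PORTDIR}/packages"

# Produce a binary package for everything that gets compiled.
FEATURES="buildpkg"
```

The other hosts would then install from the shared PKGDIR with `emerge --usepkg`, so only one machine actually compiles.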
110
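
The postinstall tool from point 5 could be as small as the function below. Treat it as a rough sketch under assumptions: the function name reset_identity is made up, the udev rules file name and the hostname="..." format of /etc/conf.d/hostname are what I'd expect on a typical install and should be checked against yours, and it only removes the old host keys rather than generating or restoring them.

```shell
#!/bin/sh
# Post-clone identity reset (hypothetical helper, see assumptions above).
# Usage: reset_identity ROOT NEW_HOSTNAME
#   ROOT is the root of the cloned system ("" for the running system).
reset_identity() {
    root=$1
    new_hostname=$2

    if [ -z "$new_hostname" ]; then
        echo "usage: reset_identity ROOT NEW_HOSTNAME" >&2
        return 1
    fi

    # Drop the SSH host keys copied over from the source machine.
    # New ones must be generated or restored from backup afterwards.
    rm -f "$root"/etc/ssh/ssh_host_*

    # Forget the source machine's MAC-to-interface mapping so udev
    # writes fresh rules on the next boot (file name is an assumption).
    rm -f "$root/etc/udev/rules.d/70-persistent-net.rules"

    # Record the new host name (file format is an assumption).
    mkdir -p "$root/etc/conf.d"
    printf 'hostname="%s"\n' "$new_hostname" > "$root/etc/conf.d/hostname"
}

# Interactive use on a freshly cloned system, as root:
#   printf 'New host name: '; read -r name; reset_identity "" "$name"
```

Taking the root directory as an argument means you can try it against a scratch directory or a mounted image before letting it loose on a real system.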
111 >>> Now I am thinking about a Gentoo installation instead.
112 >>>
113 >>> Pros:
114 >>>  - Continuous updates, no downtime for upgrading, only when I decide to
115 >>> install a new kernel. This is really really cool. I fear the upgrade
116 >>> from Fedora 10 to 12 which has to be done soon.
117 >>
118 >> Do not upgrade, especially not with a version jump of 2 or more. If you
119 >> have a  lot of machines, I assume you are a decent shop, and that you
120 >> have some form of formal process for upgrades and changes.
121 >
122 > Not really, I think. We are not very professional I must admit. We have
123 > two capable admins, but one is specialized in network stuff and Windows,
124 > the other has to do with our big Sun servers, huuge storage systems and
125 > such. They do not much about the Linux cluster. Another user sometimes
126 > installs a package on a machine, but usually I do this. For me, it is not
127 > my main job, I work only about ten hours per week there, mostly being some
128 > 100 km away.
129 > We are a research institute. We do neurological research, PET and MRI
130 > tomography. The Linux servers do number crunching, and of course they
131 > should work and have good uptimes, but it is not as important as if we
132 > were an ISP.
133 >
134 >> What you do instead is a formal migration - copy the data off,
135 >> reinstall, restore data.
136 >
137 > Advice noted. Yes, this sounds like the better idea, giving a cleaner
138 > setup. And if some things break I do not have to wonder if it was some
139 > strange side effect from the upgrade process.
140 >
141 >> If you can't afford to do that every six or twelve months, then
142 >> I have to ask - what the hell is the organization doing using a distro
143 >> that is unsupported after 12 months?
144 >
145 > Well, I do not think this was considered much. One machine was set up with
146 > Fedora for no specific reason, and we kept this distro then. This does not
147 > sound too professional, I know. BTW, what distro would you suggest?
148
149 In the times I've used it, while a bit overweight for my tastes in
150 server work, Ubuntu handled updates quite gracefully, but needed
151 reboots somewhat often. You might get the same or better out of
152 Debian, as it's a little less directly desktop-centric by design,
153 while being the source of the package management that gives Ubuntu
154 what advantages it might have for the role.
155
156 >>> - Some improvement in speed. Those machines do A LOT of
157 >>> numbercrunching, with jobs often lasting for days, so even small
158 >>> improvements would be nice.
159 >>
160 >> Don't fool yourself. Unless you need what Google needs, there is very
161 >> little speed difference between Gentoo and Fedora. I/O improvements you
162 >> need can be  easily gotten by fiddling the kernel tuning knobs.
163 >
164 > I know the difference will not be huge, I see this as a little bonus -
165 > nice if it is there, but nothing really important. But in the comparison with
166 > Ubuntu that came in a thread a few weeks ago, for some applications the
167 > speed increase was about 30 percent. Although I would not necessarily
168 > expect the difference to be noticeable, I would also not be surprised too
169 > much if it were noticeable for some number-crunching applications if they
170 > were optimized for the CPU.
171
172 Are the pieces of software you're using for the number crunching work
173 open source, and will you be recompiling those on Gentoo, with all the
174 optimizations, as well? If not, then in the long run you'll get far
175 more out of the I/O improvements Alan mentioned than you ever would
176 out of aggressive use of CFLAGS.
177
178 >>>  - Easier debugging. When things do not work, I think it's easier to
179 >>> dig into the problem. No fancy, but sometimes buggy GUIs hiding basic
180 >>> functionality.
181 >>
182 >> Errrrrrrrrrrrrrrrmmmmmmmmmmmmmm, Fedora does not require a GUI :-)
183 >
184 > Right, and now that I think of it I do not use it anyway... Well, I did do
185 > some things with netsetup (or whatever it's called), now that I know the
186 > system a little better I edit things directly in /etc/sysconfig.
187 > But the installer is a GUI, right? And if I remember this correctly, I
188 > cannot even switch to a text console and do stuff there while installing.
189 > Or I could, but did not have utilities like LVM. Something like that. I
190 > have to use the installer and its capabilities.
191 >
192 >>> - Heck, Gentoo is _cooler_ than typical distributions. And emerging
193 >>> with distcc on about 8*4 cores would be fun :)
194 >>
195 >> Can't argue with that.
196 >>
197 >> But that is your ego talking and the machines do not belong to you but
198 >> to the institute. Your ego has no place in that.
199 >
200 > You're right, thanks for the reminder. But also note the smiley. I know my
201 > boss (who is also into geeky things) would also like this - as long as it
202 > would work.
203
204 If you've a moderately capable system sitting spare, throw VirtualBox
205 or similar on it and bring up a few VMs to test the setup in (since
206 with that, you can get away with ). My little Core 2 here can handle
207 3-4 VMs without fussing at all, and that's with
208
209 >>>  - I am probably the only one who can administrate them.
210 >>
211 >> This is not a benefit. It is a severe liability.
212 >
213 > That's why I listed it also on the contra side. Forgot to add a smiley
214 > here, it was not meant seriously.
215 > But when I think about it... the others also do not know much about
216 > Fedora. Not even I do this well. There you use 'yum install <package>',
217 > with Gentoo it's 'emerge <package>'. Daily work would be similar.
218 > Upgrades would be a different thing, though. Gentoo's portage blockers
219 > would not be understood easily, they would prefer to take the servers down
220 > and just install the current Fedora distro. Which hopefully would work.
221 >
222 >
223 >>> Cons:
224 >>> - If something will not work with this not so common
225 >>> (meta)distribution, people will say "always trouble with your Gentoo
226 >>> Schmentoo, it works fine in Fedora". Fedora is more mainstream, if
227 >>> something does not work there, then it's okay for the people to accept
228 >>> it.
229 >>
230 >> Those same people are likely to say the same about linux vs windows.
231 >
232 > Right, but we already have Linux, and we need it for our software. Gentoo
233 > would not really be needed.
234 >
235 >>> - I am probably the only one who can administrate them. I think Gentoo
236 >>> is easier to maintain in the long run, but only when you take the time
237 >>> to learn it. With Fedora, you do not need much more than the 'yum
238 >>> install' command. There is no need to read complicated X.org upgrade
239 >>> guides and such.
240 >>>
241 >>> I think I already made my decision, but I am still interested in your
242 >>> opinions, maybe some of you are in a similar position and like to share
243 >>> your experiences. Whether I will be allowed to use Gentoo is another
244 >>> question, I guess my boss will not like my idea at first, and I am not
245 >>> even sure if he is right. But maybe I can test-install Gentoo on one
246 >>> machine in a chroot, and see if things work fine.
247 >>
248 >> Depends how critical these machines are. If you want to change them just
249 >> because you feel like it, then I do not see how that can possibly be a
250 >> valid reason.
251 >>
252 >> Remember, the institute's needs and desires trump yours every time
253 >
254 > No, it's not just because I feel like it. The main advantages would be:
255 > - No downtime between upgrades. Our jobs run for several days, every
256 > downtime has to be planned in advance. People understand this, but they do
257 > not like it. They would be very happy if this were no longer necessary.
258 > And I would not fear that during the upgrade something breaks, and it
259 > would take me long to fix it.
260 > - I know this distro well, and this is not at all true about Fedora. I
261 > know how to fix problems, I know how things work here. I would feel better
262 > with Gentoo, more competent. It just does not feel so well to administrate
263 > Fedora.
264 >
265 > Thanks for your opinions, Alan. As always.
266 >
267 >        Wonko
268
269 As a final note... whatever path you take in either implementing a new
270 setup or just updating the old one, document it, and especially
271 write guides for upkeep and general maintenance. Your boss, the Windows
272 guy, and the Sun guy are going to be like fish out of water if you get
273 this whole thing put in place and get hit by a bus the next day. *This*
274 is why the "I'm the only one that can..." bit is such a dangerous thing.
275 It's not the fear that you'll try to use it as a bargaining chip down
276 the road; in that case they'd just take the hit, replace you, and then
277 replace the setup with a better-documented one. It's the fear that if
278 you drop out of the picture for any reason, they're stuck with the
279 cost of doing that. Period. (This is also why, when done actively and
280 intentionally, it's a fireable offense in many places.)
281
282 --
283 Poison [BLX]
284 Joshua M. Murphy