Gentoo Archives: gentoo-server

From: Ramon van Alteren <ramon@××××××××××.nl>
To: gentoo-server@l.g.o
Cc: Dan Podeanu <pdan@×××××××××××.net>
Subject: Re: [gentoo-server] Centralized Gentoo (build -> push/pull) to multiple architectures
Date: Sun, 30 Oct 2005 21:35:44
Message-Id: BAE46B9B-9022-4DC4-A767-37E6D927C2CC@vanalteren.nl
In Reply to: Re: [gentoo-server] Centralized Gentoo (build -> push/pull) to multiple architectures by Dan Podeanu
Hi Dan,

On 20 Oct 2005, at 2:41 AM, Dan Podeanu wrote:
> Interesting topic.

Indeed. I'm moving to a different employer and was considering a
similar setup.
I'm curious about a number of things:

What's the scale of the cluster you're using this setup on?
Would you be willing and able to share some of the work?
I'd be very interested to look at your setup before I start my own.

Any comments on the hardware stability of the nodes you're using?
Which make of blades are you using?

I was also wondering whether you are familiar with the work at
http://www.infrastructures.org/
Your setup has many of its characteristics.

> Objectives:
>
> 1. Low maintenance costs: maintaining and applying patches to a
> single build (Gentoo snapshots).
> 2. Low scalability overhead: scalability should be part of the
> design, it should not take more than 10 minutes per server to
> scale up.
> 3. Redundancy: Permanent hardware failure of N-1 out of N nodes, or
> temporary failure (power off) of all nodes should allow fast (10
> minutes) recovery of all nodes in a cluster.

I read below that all nodes include configs for dhcp/tftp in order to
be able to take over from the golden (blade root) server. How do you
handle that? If the main blade root server goes down, which of the
nodes gets to take over? Is that an automatic or a manual process?

Additionally, did you test an all-node failure, and how did the master
blade root cope with the strain of all nodes booting at once? What
hardware are you using for the blade root server?

> Restrictions:
>
> 1. Single CPU architecture: I consider the cost of maintaining several
> architectures to be bigger than the cost of purchasing a single
> architecture.

Are you running a full 64-bit setup or 32-bit compatibility mode?
What are your experiences with stability in the 64-bit case? I'm
especially curious about PHP and its diverse set of external libs.
I do agree, though. Any thoughts on the inevitable upgrade that will
show up some time in the future, when your current hardware platform
is no longer available?

> 2. Unified packages tree: I consider the cost of maintaining
> several Gentoo snapshots just to have deployed the minimum of
> packages per server assigned to a specific application (mail
> server, web server etc.) to be bigger than having a common build
> with all packages and just starting the required services (ie. all
> deployed servers have both an MTA and Apache installed, just that
> web servers have Apache started, and the mail servers have it
> stopped and MTA running instead).

Agreed, it doesn't pay off to have separate base sets for the
different types of nodes, and it's good for redundancy: if needed, a
former webserver can stand in as a database server, etc.
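
As a rough illustration of the common-build idea quoted above (every
node carries the same packages, only the set of started services
differs), here is a minimal Python sketch. The role names, service
names, and the services_to_start helper are hypothetical and are not
taken from Dan's setup.

```python
# Hypothetical sketch: one common image, per-role service selection.
# Role and service names are illustrative assumptions only.

# Every node has all of these installed (the unified package tree).
INSTALLED_SERVICES = {"apache2", "postfix", "mysql", "sshd"}

# Services that each role actually starts at boot.
ROLE_SERVICES = {
    "web": {"sshd", "apache2"},
    "mail": {"sshd", "postfix"},
    "db": {"sshd", "mysql"},
}


def services_to_start(role):
    """Return the sorted list of services a node with this role should start."""
    wanted = ROLE_SERVICES[role]
    missing = wanted - INSTALLED_SERVICES
    if missing:
        # Should not happen with a unified tree; fail loudly if it does.
        raise ValueError(f"role {role!r} needs uninstalled services: {sorted(missing)}")
    return sorted(wanted)


if __name__ == "__main__":
    # A former web server standing in as a database server only changes role.
    print(services_to_start("web"))
    print(services_to_start("db"))
```

Under these assumptions, repurposing a node means changing its role
entry, not rebuilding or reinstalling anything.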

> 3. An application that can act as a cluster with transparent
> failover (web with balancer and health checking, multiple MX
> servers, etc.)

I don't understand this restriction?

> 4. A remote storage for persistent data (like logs) helps (you will
> see why); you can modify the partitioning or harddisk configuration
> to maintain a stable filesystem on individual servers.

<snipped>

> Software:
>
> One initial server (blade root) is installed with Gentoo. On top of
> that, in a directory, another Gentoo is installed (Gentoo snapshot)
> that will be replicated on individual servers as further described,
> and all maintenance to the snapshot is done in chroot.
>
> The Blade root runs DHCP and tftp and is able to answer PXE dhcp/tftp
> requests (for network boot) and serve an initial bootloader (grub
> 0.95 with diskless and diskless-undi patches to allow detection of
> Broadcom NICs), along with an initial initrd filesystem.
>
> The Gentoo snapshot contains all the packages required for all
> applications (roughly 2gb on our systems), along with dhcp/tftp and
> configs, to allow it to act as Blade root.

See my question above: is switching manual?

> In addition, the Blade root contains individual configurations for
> every individual deployed server (or, rather, only changes to the
> standard Gentoo config, ie. per-blade IPs, custom application
> configs, different configuration for services to start at boot, etc.)

Do you use classes here (e.g. webserver, databaseserver, mailserver,
cachingserver etc.)?
Or do you maintain individual setups for each server?
Which scripting language did you choose for the config scripts and
related tooling, and why that language?
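
To make the classes question concrete, here is a minimal Python sketch
of layered configuration: a base config, then per-class overrides,
then per-host overrides. All keys, class names, host names, and
addresses are hypothetical, and nothing here is taken from Dan's
actual scripts.

```python
# Hypothetical sketch of class-based configuration layering.
# All keys, class names, host names, and IPs are illustrative assumptions.

BASE = {"ntp_server": "10.0.0.1", "services": ["sshd"]}

CLASSES = {
    "webserver": {"services": ["sshd", "apache2"]},
    "mailserver": {"services": ["sshd", "postfix"]},
}

HOSTS = {
    "blade01": {"class": "webserver", "ip": "10.0.1.1"},
    "blade02": {"class": "mailserver", "ip": "10.0.1.2"},
}


def effective_config(hostname):
    """Merge base, then class, then per-host settings (later layers win)."""
    host = dict(HOSTS[hostname])      # copy so the table is not mutated
    server_class = host.pop("class")
    config = dict(BASE)
    config.update(CLASSES[server_class])
    config.update(host)
    config["class"] = server_class
    return config


if __name__ == "__main__":
    print(effective_config("blade01"))
```

With this kind of layering, only the per-host layer (here just the IP
and class) would differ between servers of the same class.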

<booting process snipped>

I'm also curious what QA procedures you have in place to prevent
accidental mistakes on the blade root server. I assume you test
beforehand? On all server classes? Modifications to the third
archive with the per-server config seem rather difficult to test.

> I hope this helps.

Oh, it sure did. It confirmed some ideas I was already thinking about
and gave me a real-world example that it can be done :-)

Thanks,

Ramon
--
Change what you're saying,
Don't change what you said

The Eels

--
gentoo-server@g.o mailing list