Gentoo Archives: gentoo-server

From: Dan Podeanu <pdan@×××××××××××.net>
To: gentoo-server@l.g.o
Cc: theboywho <theboywho@×××××××××.com>
Subject: Re: [gentoo-server] Centralized Gentoo (build -> push/pull) to multiple architectures
Date: Thu, 20 Oct 2005 00:43:13
Message-Id: 003301c5d50f$0fca7780$0c01020a@nod.cc
In Reply to: Re: [gentoo-server] Centralized Gentoo (build -> push/pull) to multiple architectures by "Ian P. Christian"
Interesting topic. My solution for centralized management (I finished
implementing it several months ago) is something like this:

Definitions:

Gentoo snapshot: an image of a Gentoo installation that, deployed on a
server, allows booting a running system.
Blade root: a server designated, at a given time, as the boot server
for the others.

Objectives:

1. Low maintenance cost: maintain and patch a single build (the Gentoo
snapshot).
2. Low scalability overhead: scalability should be part of the design;
scaling up should take no more than 10 minutes per server.
3. Redundancy: permanent hardware failure of N-1 out of N nodes, or
temporary failure (power off) of all nodes, should allow fast (10
minutes) recovery of all nodes in a cluster.

Restrictions:

1. Single CPU architecture: I consider the cost of maintaining several
architectures to be higher than the cost of standardizing on a single
one.

2. Unified package tree: I consider the cost of maintaining several
Gentoo snapshots, just so each server carries only the minimum packages
for its assigned application (mail server, web server, etc.), to be
higher than having a common build with all packages installed and only
the required services started (i.e. every deployed server has both an
MTA and Apache installed; the web servers have Apache started, while
the mail servers keep it stopped and run the MTA instead).

3. An application that can act as a cluster with transparent failover
(web behind a load balancer with health checking, multiple MX servers,
etc.).

4. Remote storage for persistent data (such as logs) helps (you will
see why); alternatively, you can modify the partitioning or hard disk
configuration to keep a stable filesystem on individual servers.

Hardware:

1. AMD Opteron blades: 2x Opteron, 4-12 GB RAM, 1 SATA HDD. Reasons
for choosing:
   - Opteron is cheaper and faster than Xeon, and we can swap single
     cores for dual cores at any point without generating too much
     heat.

   - 4-12 GB RAM & SATA: I prefer the OS to cache a lot in RAM to
     speed things up, rather than buying little RAM and expensive
     SCSI. Hard disks have too many moving parts and generate a lot of
     heat, which is a problem in a dense CPU environment such as
     blades (and you can't trust your datacenter to really cool
     things). RAM is cheap nowadays anyway.

2. Gigabit network cards with PXE support.

Software:

One initial server (the Blade root) is installed with Gentoo. On top of
that, in a directory, another Gentoo is installed (the Gentoo snapshot)
that will be replicated to individual servers as described below; all
maintenance on the snapshot is done in a chroot.

The Blade root runs DHCP and tftp and can answer PXE dhcp/tftp requests
(for network boot), serving an initial bootloader (grub 0.95 with the
diskless and diskless-undi patches, which allow detection of Broadcom
NICs) along with an initial initrd filesystem.
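For illustration only (the subnet, range, and filenames below are
invented, not taken from our actual setup), the PXE side of the Blade
root's dhcpd.conf looks roughly like this:

```
# dhcpd.conf fragment -- illustrative values only
subnet 10.0.0.0 netmask 255.255.255.0 {
    range 10.0.0.100 10.0.0.200;
    next-server 10.0.0.1;      # the Blade root, also running tftpd
    filename "pxegrub";        # patched grub 0.95 network boot image
}
```

next-server points the PXE client at the tftp server, and filename
names the bootloader image it should fetch from there.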

The Gentoo snapshot contains all the packages required for all
applications (roughly 2 GB on our systems), along with dhcp/tftp and
their configs, so that it can itself act as a Blade root.

In addition, the Blade root holds an individual configuration for every
deployed server (or rather, only the changes relative to the standard
Gentoo config: per-blade IPs, custom application configs, different
sets of services to start at boot, etc.).

The Gentoo snapshot is compressed into two tar.gz archives: one with
the 'running code' (i.e. /usr, /bin, etc.) and another with additional
things we don't really need on every server (the portage tree, usr/src,
manpages, etc.). The scripts for compressing everything, the initrd,
all the individual blade configurations, and miscellaneous scripts go
into a third archive.
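The packaging step can be sketched like this (a minimal illustration,
not our actual scripts -- the function name, archive names, and the
exact exclude list are assumptions; the demo at the bottom runs on a
throwaway directory standing in for the real snapshot chroot):

```shell
#!/bin/sh
set -e

# Split a snapshot tree into a "core" archive (running code) and an
# "extra" archive (portage tree, kernel sources, manpages).
pack_snapshot() {
    snap=$1; out=$2
    mkdir -p "$out"
    ( cd "$snap" && tar czf "$out/snapshot-core.tar.gz" \
          --exclude='./usr/portage' \
          --exclude='./usr/src' \
          --exclude='./usr/share/man' . )
    ( cd "$snap" && tar czf "$out/snapshot-extra.tar.gz" \
          usr/portage usr/src usr/share/man )
}

# Demo on a tiny fake tree.
demo=$(mktemp -d)
mkdir -p "$demo/snap/bin" "$demo/snap/usr/portage" \
         "$demo/snap/usr/src" "$demo/snap/usr/share/man"
echo tool > "$demo/snap/bin/tool"
echo tree > "$demo/snap/usr/portage/pkg"
pack_snapshot "$demo/snap" "$demo/out"
ls "$demo/out"
```

The point of the split is that blades only ever need the core archive
at boot time; the extras stay on the Blade root.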

Booting takes place like this:

The Blade root is powered on and ready to answer PXE requests.

A blade boots and uses PXE to get an IP via DHCP and an initial grub
image via tftp; grub then downloads its configuration file, also via
tftp.

The boot menu is displayed, with a default entry and a timeout. Grub
downloads the Linux kernel via tftp, along with the initrd image. The
kernel boots, mounts the initrd, and executes the /linuxrc script.

The initrd contains busybox, an rsync client, and fdisk/tar/gzip. When
it starts, it downloads the Gentoo snapshot from the Blade root via
rsync, along with the blade configuration and the archive of scripts;
uses fdisk to recreate the partition table; creates a filesystem;
unpacks the snapshot and the blade configuration onto the target /
partition; changes the root; and execs init, thus booting Gentoo. At
the end of the Gentoo bootup, grub is run locally to install a
bootloader (to allow booting even if the Blade root server is
unavailable), and the services required for the blade's particular
application are started.
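In outline, the /linuxrc sequence looks something like the sketch
below. This is pseudocode, not our actual script, and it is not meant
to be run as-is: the device name, rsync module, hostname-based archive
name, and paths are all placeholders.

```shell
#!/bin/sh
# /linuxrc sketch -- placeholders throughout, do not run as-is.

# 1. Fetch the snapshot archives, blade config, and scripts via rsync.
rsync -a bladeroot::snapshot/ /tmp/snapshot/

# 2. Recreate the partition table and filesystem on the local SATA disk.
printf 'o\nn\np\n1\n\n\nw\n' | fdisk /dev/sda
mke2fs -j /dev/sda1
mount /dev/sda1 /mnt/root

# 3. Unpack the snapshot, then the per-blade configuration on top.
tar xzf /tmp/snapshot/snapshot-core.tar.gz -C /mnt/root
tar xzf "/tmp/snapshot/blade-$(hostname).tar.gz" -C /mnt/root

# 4. Switch root and hand control over to init.
cd /mnt/root
pivot_root . initrd
exec chroot . /sbin/init
```

Unpacking the blade config after the snapshot is what lets a single
shared image diverge into per-server setups.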

The result is a server that has been booted remotely and is an exact
image of the single blade source. It also contains everything needed to
boot by itself, and everything needed to boot further servers in turn.
Thanks to the individual per-server configuration, only the services
meant for that machine are started, and bootup takes only 3-4 minutes
longer than usual.

After booting one blade, using it as the source to boot the initial
Blade root ensures that all servers involved share the same setup: a
self-replicating system.

Maintenance cost comes down to updating the Blade root's snapshot.
After an emerge -u world, rebooting the other blades one at a time
distributes the changes. On top of this you can use whatever
synchronization method suits the particular application you're
deploying.
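The update cycle then amounts to something like this (illustrative
commands only, shown as a sketch; the snapshot directory path is an
assumption):

```shell
# Enter the snapshot and update it like any other Gentoo system.
mount -t proc proc /blade/snapshot/proc
chroot /blade/snapshot emerge --sync
chroot /blade/snapshot emerge -u world
umount /blade/snapshot/proc

# Re-pack the snapshot archives, then reboot the blades one at a time
# so each one pulls the updated snapshot over PXE/rsync.
```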

I hope this helps.

Cheers,
Dan.


----- Original Message -----
From: "Ian P. Christian" <pookey@×××××××××.uk>
To: <gentoo-server@l.g.o>
Cc: "theboywho" <theboywho@×××××××××.com>
Sent: Thursday, October 20, 2005 12:22 AM
Subject: Re: [gentoo-server] Centralized Gentoo (build -> push/pull) to
multiple architectures


--
gentoo-server@g.o mailing list

Replies

Subject Author
Re: [gentoo-server] Centralized Gentoo (build -> push/pull) to multiple architectures Ramon van Alteren <ramon@××××××××××.nl>