Interesting topic. My solution (I finished implementing it several months
ago) for centralized management is something like this:

Definitions:

Gentoo snapshot: an image of a Gentoo installation that, deployed on a
server, allows booting into a running system.
Blade root: the server designated at a given time as the boot server for
the others.

Objectives:

1. Low maintenance costs: maintaining and applying patches to a single
build (the Gentoo snapshot).
2. Low scalability overhead: scalability should be part of the design; it
should not take more than 10 minutes per server to scale up.
3. Redundancy: permanent hardware failure of N-1 out of N nodes, or
temporary failure (power off) of all nodes, should allow fast (10 minutes)
recovery of all nodes in a cluster.

Restrictions:

1. Single CPU architecture: I consider the cost of maintaining several
architectures to be bigger than the cost of purchasing a single
architecture.

2. Unified packages tree: I consider the cost of maintaining several
Gentoo snapshots, just so each server carries only the minimum packages
for its assigned application (mail server, web server, etc.), to be
bigger than the cost of a common build with all packages where only the
required services are started (i.e. all deployed servers have both an MTA
and Apache installed; the web servers have Apache started, and the mail
servers have it stopped and the MTA running instead).

3. An application that can act as a cluster with transparent failover
(web with a load balancer and health checking, multiple MX servers, etc.).

4. Remote storage for persistent data (like logs) helps (you will see
why); you can modify the partitioning or hard disk configuration to
maintain a stable filesystem on individual servers.

Hardware:

1. AMD Opteron blades: 2x Opteron, 4-12 GB RAM, 1 SATA HDD. Reasons for
choosing:
   - Opteron is cheaper and faster than Xeon, and we can replace single
     cores with dual cores at any point without generating too much heat.

   - 4-12 GB RAM & SATA: I prefer the OS to cache a lot of things in RAM
     to speed things up, as opposed to getting a little RAM and expensive
     SCSI. Hard disks have too many moving parts and generate a lot of
     heat, which is a problem in a dense CPU environment such as blades
     (and you can't trust your datacenter to really cool things). And RAM
     is cheap nowadays anyway.

2. Gigabit network cards with PXE.

Software:

One initial server (the Blade root) is installed with Gentoo. On top of
that, in a directory, another Gentoo is installed (the Gentoo snapshot)
that will be replicated to the individual servers as described further
below; all maintenance to the snapshot is done in a chroot.
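
For illustration only, assuming the snapshot lives under /data/snapshot
(the path is made up), maintenance inside the chroot follows the standard
Gentoo chroot procedure, roughly:

  # on the Blade root
  mount -t proc none /data/snapshot/proc
  mount -o bind /dev /data/snapshot/dev
  cp -L /etc/resolv.conf /data/snapshot/etc/
  chroot /data/snapshot /bin/bash
  env-update && source /etc/profile
  # ...emerge packages, edit configs...
  exit
  umount /data/snapshot/dev /data/snapshot/proc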

The Blade root runs DHCP and tftp and is able to answer PXE dhcp/tftp
requests (for network boot) and serve an initial bootloader (grub 0.95
with the diskless and diskless-undi patches to allow detection of
Broadcom NICs), along with an initial initrd filesystem.
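
As an illustration (the choice of ISC dhcpd, the addresses and the file
names here are only placeholders), the PXE side of the DHCP config boils
down to a fragment like this:

  subnet 10.0.0.0 netmask 255.255.255.0 {
      range 10.0.0.100 10.0.0.200;
      next-server 10.0.0.1;      # the Blade root (tftp server)
      filename "pxegrub";        # the patched grub 0.95 network image
  }

  # per-blade addresses can be pinned by MAC
  host blade01 {
      hardware ethernet 00:11:22:33:44:55;
      fixed-address 10.0.0.11;
  }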

The Gentoo snapshot contains all the packages required for all
applications (roughly 2 GB on our systems), along with dhcp/tftp and
their configs, to allow it to act as a Blade root.

In addition, the Blade root contains an individual configuration for
every deployed server (or rather, only the changes to the standard Gentoo
config, i.e. per-blade IPs, custom application configs, a different set
of services to start at boot, etc.).
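
Purely as an illustration (the actual layout will differ), such a
per-blade configuration can be a small overlay tree that is untarred over
/ after the snapshot is extracted, e.g.:

  blades/blade01/
      etc/conf.d/hostname
      etc/conf.d/net              # per-blade IP
      etc/runlevels/default/      # symlinks for the services this blade runs
      etc/apache2/...             # application-specific configs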

The Gentoo snapshot is compressed as tar.gz into two archives: one with
the 'running code' (i.e. /usr, /bin, etc.) and another with the
additional things we don't really need on every server (like portage,
/usr/src, man pages, etc.). The collection of scripts for compressing
everything, the initrd, all the individual blade configurations, and misc
scripts are archived in a third archive.
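
A sketch of how the two snapshot archives might be produced (the paths
and exclude lists are just examples, not the real scripts):

  cd /data/snapshot

  # 'running code': everything a blade needs to actually run
  tar czpf /data/archives/snapshot-core.tar.gz \
      --exclude='./usr/portage' --exclude='./usr/src' \
      --exclude='./usr/share/man' --exclude='./usr/share/doc' \
      --exclude='./proc/*' --exclude='./tmp/*' .

  # the extras we don't need on every server
  tar czpf /data/archives/snapshot-extra.tar.gz \
      ./usr/portage ./usr/src ./usr/share/man ./usr/share/doc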

Booting takes place like this:

The Blade root is powered on and ready to answer PXE.

A blade boots and uses PXE to get an IP via DHCP and an initial grub
image via tftp; grub then downloads its configuration file, also via
tftp.

A boot menu is displayed, with a default entry and a timeout. Grub
downloads the Linux kernel via tftp, along with the initrd image. The
kernel boots, mounts the initrd, and executes the /linuxrc script.
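
The menu grub fetches is an ordinary grub 0.95 menu.lst; a sketch (the
kernel/initrd names and parameters are placeholders) might look like:

  default 0
  timeout 10

  title Gentoo blade (network boot)
      root (nd)
      kernel /kernel-2.6 root=/dev/ram0 init=/linuxrc
      initrd /initrd.gz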

The initrd contains busybox, an rsync client, and fdisk/tar/gzip. When it
starts, it downloads the Gentoo snapshot from the Blade root, along with
the archive holding the blade configurations and archiving scripts (via
rsync), uses fdisk to recreate a partition table, creates a filesystem,
uncompresses the snapshot and the blade configuration onto the target /
partition, changes the root, and exec's init, thus booting Gentoo. At the
end of the Gentoo bootup, grub is run locally to install a bootloader (so
the blade can boot even if the Blade root server is unavailable), and the
services required for the particular application the blade is intended
for are started.
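
A heavily simplified sketch of such a /linuxrc (the server address,
archive names and the use of an rsync daemon module are all assumptions;
the real script also handles errors, picks the right blade config, etc.):

  #!/bin/sh
  mount -t proc proc /proc
  udhcpc -i eth0                      # busybox DHCP client, if built in

  ROOTSRV=10.0.0.1                    # the current Blade root
  rsync -a ${ROOTSRV}::snapshots/ /tmp/dl/

  # recreate the partition table and filesystem on the local SATA disk
  echo -e "o\nn\np\n1\n\n\nw" | fdisk /dev/sda
  mke2fs -j /dev/sda1                 # assumes mke2fs is on the initrd

  mount /dev/sda1 /mnt/root
  tar xzpf /tmp/dl/snapshot-core.tar.gz -C /mnt/root
  tar xzpf /tmp/dl/blade01-config.tar.gz -C /mnt/root   # overrides last

  cd /mnt/root
  mkdir -p initrd
  pivot_root . initrd                 # new tree becomes /, old initrd moves
  exec chroot . /sbin/init </dev/console >/dev/console 2>&1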

The result is a server that has been booted remotely and is an exact
image of the unique blade source. It also contains everything needed to
boot by itself, and everything needed to further boot other servers.
Thanks to the individual per-server configuration, only the services
meant for that machine are started, and the bootup only takes 3-4 minutes
longer than usual.

After booting one blade, using it as the source to boot the initial Blade
root ensures that all the servers involved share the same setup, making
the system self-replicating.

Maintenance costs come down to updating the Gentoo snapshot on the Blade
root. After an emerge -u world in the chroot, rebooting the other blades
one at a time will distribute the changes. On top of this you can use
whatever synchronization method you want for the particular application
you're deploying.
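
Using the same made-up paths and host names as above, one update cycle is
roughly:

  # on the Blade root: update the snapshot inside the chroot
  chroot /data/snapshot emerge --sync
  chroot /data/snapshot emerge -u world

  # re-pack the snapshot archives (the script collection mentioned above)
  /data/scripts/make-snapshot-archives.sh

  # reboot the blades one at a time so the cluster stays up
  for blade in blade01 blade02 blade03; do
      ssh $blade reboot
      sleep 600      # give each one ~10 minutes to come back
  done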

I hope this helps.

Cheers,
Dan.

----- Original Message -----
From: "Ian P. Christian" <pookey@×××××××××.uk>
To: <gentoo-server@l.g.o>
Cc: "theboywho" <theboywho@×××××××××.com>
Sent: Thursday, October 20, 2005 12:22 AM
Subject: Re: [gentoo-server] Centralized Gentoo (build -> push/pull) to
multiple architectures