I think this problem is actually quite common and not just restricted to
number-crunching clusters.

We're running Gentoo on about 200 workstations at the University of
Pretoria, and I've spent a fair amount of time trying to find the best
solution for keeping the machines in sync and up to date. It gets worse
when Linux is the secondary OS and the labs are run by non-Linux people.

We create a master image for each specific hardware combination and
distribute that image to all the workstations. If I had any say in it,
we would've used UDPCast from http://udpcast.linux.lu/, but the suits
like paying for "enterprise" software like Ghost or Imagecast.

UDPCast, together with PXE booting, could work nicely. You still install
everything on the workstations, but the boot loader is downloaded. This
way, you can swap the boot loader at any time to "cast" any or all
machines. UDPCast also uses multicast and sector-by-sector copying, so
casting more machines doesn't take substantially longer, and it doesn't
have to understand the file system.
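To put rough numbers on the multicast point, here is a back-of-the-envelope
sketch in Python. The image size and link speed are made-up figures for
illustration, not measurements from our labs, and the model ignores
retransmissions, disk speed and protocol overhead.

```python
def transfer_hours(image_gb, link_mbit, copies):
    """Hours needed to push `copies` full copies of an image over one link."""
    image_mbit = image_gb * 8 * 1000  # gigabytes -> megabits (decimal units)
    return copies * image_mbit / link_mbit / 3600

IMAGE_GB = 10    # hypothetical image size
LINK_MBIT = 100  # hypothetical server uplink
MACHINES = 200   # roughly the number of workstations mentioned above

# Unicast: the server sends one full copy per machine.
unicast = transfer_hours(IMAGE_GB, LINK_MBIT, MACHINES)
# Multicast: the server sends one copy and every machine listens to it.
multicast = transfer_hours(IMAGE_GB, LINK_MBIT, 1)

print(f"unicast:   {unicast:.1f} h")
print(f"multicast: {multicast:.2f} h")
```

Under these assumptions the unicast time grows linearly with the number of
machines, while the multicast time stays at the cost of a single copy.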
 
At the moment, though, we're "casting" every few months, whenever the
other OS breaks or needs patching. Gentoo updates are made at this time,
and smaller/critical updates are made with some home-grown scripts that
basically just look in a specific NFS-mounted directory for scripts and
packages at boot time. This creates some extra work with managing config
files. I'll definitely have a look at cfengine to solve this.
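For illustration, a boot-time hook of that kind could be sketched roughly
as below. This is not our actual script: the function name, the ".sh"
naming rule and the stamp directory are all hypothetical, and the sketch
only handles shell scripts, not packages.

```python
import os
import subprocess

def run_pending_updates(update_dir, stamp_dir):
    """Run each *.sh file in update_dir once, in name order.

    Returns the names of the scripts that ran successfully this time.
    """
    os.makedirs(stamp_dir, exist_ok=True)
    ran = []
    for name in sorted(os.listdir(update_dir)):
        script = os.path.join(update_dir, name)
        stamp = os.path.join(stamp_dir, name + ".done")
        if not name.endswith(".sh") or not os.path.isfile(script):
            continue  # ignore packages, directories and other files
        if os.path.exists(stamp):
            continue  # already applied on an earlier boot
        # Record success only, so a failed update is retried next boot.
        if subprocess.call(["/bin/sh", script]) == 0:
            with open(stamp, "w"):
                pass
            ran.append(name)
    return ran
```

A local start-up script would call something like
run_pending_updates("/mnt/updates", "/var/lib/lab-updates") once the NFS
share is mounted; both paths are placeholders. Going through /bin/sh means
the share itself can be mounted noexec.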
 
We're considering mounting /usr/portage from a file server. You can
actually keep a full Gentoo installation in a directory on a server and
cross-compile binaries for the clients. Config files and large updates
are still a problem, but with the UDPCast + PXE approach, you can cast
an up-to-date and configured image whenever you want.

Hope this is worth something.

Andrew

On Wed, 2003-10-15 at 20:51, Kurt Lieber wrote:
> All --
>
> Curious what various suggestions you can offer for the following problem:
>
> I have a large cluster of machines that I will be building using Gentoo. I
> will be using partimage for the initial installation (build one gold server
> and clone the rest from it) and cfengine for ongoing configuration
> management of things like /etc/ files, etc.
>
> The one area I'm stuck on is how to manage package upgrades. I don't want
> to have to run 'emerge -u world' on each machine. In fact, I don't want to
> have to compile things on each machine. Ideally, I want to only have to
> upgrade one machine and have those changes propagate out automatically to
> the other machines from there. (All hardware will be identical.)
>
> I've looked at rdist, but I've never used that before, so I'm not sure how
> well it would work.
>
> Another suggestion was maintaining a custom portage tree and running a
> nightly 'emerge -u --usepkg world' on all the boxes. Then, by making a
> change to the portage tree (which the other servers would sync from), it
> would propagate out to the servers automatically the next time the cron job
> ran. This is an option and probably the best one I have at this point, but
> it seems somewhat fragile -- one slip-up in the portage tree and I'm hosed.
> This also wouldn't work for installing new kernels.
>
> A third suggestion was simply to use rsync. This has some undesirable CPU
> overhead, however, as rsync recurses through the various directories.
>
> A final solution would be to simply re-image machines each time we want to
> upgrade them. This is a rather brutal solution and requires on-site
> presence, but I actually kind of like it, otherwise.
>
> So, what other suggestions do you have?
>
> --kurt