This may be a side note, but here is my take on the whole thing:
Our cluster has 132 nodes, plus we have another cluster with 60. I am
putting Gentoo on both of them now. (Tonight I am doing the big cluster.)
|
Update procedure:
Compile the new update on a compute node through the scheduler:
# qsub -b y -N update emerge -B <foobar>

Now install:
# pdsh -a
pdsh> emerge -k <foobar>

Now every node has the update. =)
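
For what it's worth, the two steps above could be wrapped in a small shell
function. This is only a sketch: cluster_update and DRY_RUN are names made up
for illustration, and only the qsub, emerge and pdsh invocations come from the
procedure above. It prints the command by default, so nothing is executed
unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: cluster_update and DRY_RUN are hypothetical names.
# The qsub/emerge/pdsh command lines are the ones from the procedure above.

cluster_update() {
    step="$1"
    pkg="$2"
    if [ -z "$pkg" ]; then
        echo "usage: cluster_update build|install <package>" >&2
        return 2
    fi
    case "$step" in
        # Build a binary package on a compute node via the scheduler.
        build)   set -- qsub -b y -N update emerge -B "$pkg" ;;
        # Install the prebuilt binary package on every node.
        install) set -- pdsh -a "emerge -k $pkg" ;;
        *)
            echo "usage: cluster_update build|install <package>" >&2
            return 2
            ;;
    esac
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run (default): only show what would be executed.
        echo "$*"
    else
        "$@"
    fi
}
```

Run "cluster_update build <foobar>" first, wait for the scheduler job to
finish, then run "cluster_update install <foobar>". The two steps are kept
separate on purpose, since the install only makes sense once the binary
package exists.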
|
The best part about it is that we can set up pdsh to connect to our
install image as well as our compute nodes, so everything is updated
all at once.

The only thing we had to do to get this to work was NFS mount the
/usr/portage directory.
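
Since the install step depends on that mount (presumably because the binary
packages built with emerge -B end up under /usr/portage, so every node has to
see the same directory), a quick check before running emerge -k may be worth
having. A minimal sketch, assuming a Linux node where /proc/mounts is
available; is_nfs_mount is a made-up helper name:

```shell
#!/bin/sh
# Sketch only: is_nfs_mount is a hypothetical helper.
# Succeeds if the given mount point is listed with an NFS filesystem type
# (nfs, nfs4, ...) in a mounts table in /proc/mounts format.

is_nfs_mount() {
    mount_point="$1"
    mounts_file="${2:-/proc/mounts}"
    # Fields in /proc/mounts: device, mount point, fs type, options, ...
    awk -v mp="$mount_point" '
        $2 == mp && $3 ~ /^nfs/ { found = 1 }
        END { exit !found }
    ' "$mounts_file"
}
```

On a node this could be used as "is_nfs_mount /usr/portage && emerge -k <foobar>",
so a node that lost the mount skips the install instead of failing halfway.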
|
Plus, if a node gets out of whack with everything else, we run:
# ssh <node> dd if=/dev/zero of=/dev/<hda,sda> bs=1024 count=1
# ssh <node> reboot

After that YACI takes over and everything is imaged in about 10 minutes.
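
The dd line zeroes the first 1024 bytes of the disk, which covers the boot
sector and partition table, so the node presumably can no longer boot from
disk and gets re-imaged on the next boot. Because this is destructive, a
wrapper that defaults to a dry run might be a sensible way to script it. A
sketch only; reimage_node and DRY_RUN are hypothetical names, and the disk
name (hda or sda) has to be passed explicitly:

```shell
#!/bin/sh
# Sketch only: reimage_node and DRY_RUN are hypothetical names.
# The ssh/dd/reboot command lines are the ones from the text above.

reimage_node() {
    node="$1"
    disk="$2"    # hda or sda, depending on the node
    if [ -z "$node" ] || [ -z "$disk" ]; then
        echo "usage: reimage_node <node> <disk>" >&2
        return 2
    fi
    wipe="ssh $node dd if=/dev/zero of=/dev/$disk bs=1024 count=1"
    boot="ssh $node reboot"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run (default): print the commands instead of running them.
        printf '%s\n' "$wipe" "$boot"
    else
        # Only reboot if the wipe succeeded.
        $wipe && $boot
    fi
}
```

Calling "reimage_node <node> sda" prints both commands; setting DRY_RUN=0
actually runs them.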
|
The total update process takes about 5 minutes, not counting compile time.
|
On Nov 20, 2005, at 5:51 PM, Stéphane Lacasse wrote: |
|
>>> For this reason I much prefer Rocks Cluster,
>>> which is really a breeze to install, but it does not have the bootp
>>> paradigm...
>> |
>> Which comes back to my original post that started the thread. I want
>> to make an entirely _Gentoo_ based cluster. A good reason for this is
>> that Gentoo, well... is Gentoo. I don't want/need to start the
>> philosophical debate on why Gentoo is better than RHEL (WS), CentOS
>> and everything else on which Rocks is based...
>> |
> Hey, if you can create a Gentoo-based cluster that is as easy to
> install and maintain as Rocks Cluster is, you have my support. Having
> used Gentoo for 3 years, the benefit is obvious to me ;)
> |
>>> Each node is a full image on its own, but management is
>>> centralized through the headnode.
>> |
>> O_o.... now _that_ is something I would call inefficient. I can't
>> imagine a 1024 node cluster running off 1024 images stored on one
>> server.
> |
> In fact, there are no images. Each node is a full system on the hard
> drive. The head node can "order" the nodes to update their software,
> so updating the headnode also takes care of updating the nodes
> automatically.
> |
> Like you said, it's beside the point, but I think you could take a
> look at their design and draw some inspiration from it: an easy
> install like Rocks Cluster, but instead of using Kickstart(tm) files,
> it would use the emerge system + distcc + quickpkg.
> |
> --
> gentoo-cluster@g.o mailing list
> |
|
-- |
gentoo-cluster@g.o mailing list |