Gentoo Archives: gentoo-cluster

From: Justin Bronder <jsbronder@×××××.com>
To: gentoo-cluster@l.g.o
Subject: Re: [gentoo-cluster] High-Availability Howto for Gentoo
Date: Sun, 09 Apr 2006 03:18:27
Message-Id: 44387D07.1080509@gmail.com
In Reply to: Re: [gentoo-cluster] High-Availability Howto for Gentoo by Hanni Ali
1 Greetings,
2
3 I'm currently employed at a site with some Xserve G5's and a smattering
4 of PIII's.
5 I cannot comment on High Availability Clusters, but I'll be more than
6 willing to
7 discuss the HPC side of clusters.
8
9 Right now we primarily run OS X on the G5's; however, work is in progress
10 to allow
11 job-submission time switching between OS X and Linux (Debian or Gentoo
12 currently,
13 others in the future possibly) based upon user-submitted requests.
14
15 As we run a variety of operating systems, I personally prefer to compile
16 the HPC-
17 oriented applications from source. Anyway, I noticed a request for
18 software
19 recommendations earlier in this thread, so here's a list of the first
20 things I
21 end up installing when we build a test/development cluster, along with
22 the versions I have running.
23
24 Torque (2.0.0p5)
25 Mpich (1.2.7)
26 Mpichgm (Myrinet support, based on 1.2.6)
27 Mpiexec (0.80)
28 Atlas (3.7.11)
29 HPL (To test the install mainly)
30
31 We also find it nice to have server(s) providing:
32 LDAP
33 DHCP and related netbooting services. (We've written our own; it is at a
34 very early alpha stage right now.)
35 NFS for home directories only. We've found numerous scalability
36 problems with diskless nodes.
37
38 Of course, a shameless plug for our MyPBS package is also required:
39 http://sourceforge.net/projects/my-pbs/
40
41 This is just a quick list of what I think any documentation on an HPC
42 cluster needs to
43 cover at minimum. I'm by no means an expert, but I would like to offer
44 my help.
45
46 Hanni Ali wrote:
47
48 > Ok,
49 >
50 > I suggest we try to put the documentation together on gentoo-wiki.com
51 > <http://gentoo-wiki.com>. I've always found this site an excellent
52 > resource.
53 >
54 > There are already two stubs which I feel we should build on and kyron
55 > has compiled an excellent list of programs if you follow the links.
56 >
57 > http://gentoo-wiki.com/Index:HOWTO#Build_a_Gentoo_High_Performance_Cluster
58 >
59 > I suggest we start Build a Gentoo High Availability Cluster.
60 >
61 > http://www.gentoo.org/proj/en/cluster/
62 >
63 > This is the Gentoo cluster page, and we only have three HOWTOs, all of
64 > which have flaws. I've kept tabs on problems I've run into with the
65 > HPC howto and distcc howto. I feel we should keep openMosix separate
66 > and have a completely separate set of Howto's for that.
67 >
68 > My clusters are generally diskless nodes so I suggest we try to
69 > incorporate this howto into the gentoo-wiki:
70 >
71 > http://www.gentoo.org/doc/en/diskless-howto.xml
72 > <http://www.gentoo.org/doc/en/diskless-howto.xml>
73 >
74 > Though this also has its fair share of difficulties.
75 >
76 > I'm prepared to share a certain amount of my work on this. It would be
77 > nice to make this documentation easily understandable for all and I'm
78 > always up for people adding where they run into problems and WHY into
79 > these sorts of documents.
80 >
81 > I'm looking carefully at HA diskless nodes and ways in which to ensure
82 > redundancy if the master node fails. Suggestions on this would be
83 > welcomed.
84 >
85 > How many people would be interested in helping out with this? If
86 > you've read this far it must be because it's a Friday afternoon so
87 > anything can distract you!
88 >
89 > Cheers
90 >
91 > Hanni
92 >
93 >
94
95 --
96 Justin Bronder
97 University of Maine, Orono
98
99 Advanced Computing Research Lab
100 20 Godfrey Dr
101 Orono, ME 04473
102 www.clusters.umaine.edu
103
104 Mathematics Department
105 425 Neville Hall
106 Orono, ME 04469
107
108
109 --
110 gentoo-cluster@g.o mailing list