Gentoo Archives: gentoo-user

From: "Boyd Stephen Smith Jr." <bss03@××××××××××.net>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Linux Cluster
Date: Sat, 01 Jul 2006 05:57:52
Message-Id: 200607010051.44525.bss03@volumehost.net
In Reply to: [gentoo-user] Linux Cluster by Bruno Lustosa
1 On Thursday 25 May 2006 14:13, "Bruno Lustosa" <bruno.lists@×××××.com>
2 wrote about '[gentoo-user] Linux Cluster':
3 > - Distributed filesystem, so that all machines can share the same
4 > filesystem. Something like RAID-over-ethernet.
5
6 You probably want RH's GFS (there are probably other cluster-aware
7 filesystems available for linux that I'm not aware of) and some sort of
8 external storage that allows you to hook two machines to it. You might
9 also look into multipathing, which would help in case of a cable failure.
10
11 For maximum availability, you want your enclosure to have two SCSI disk
12 controllers, each with two separate SCSI ports (these ports are on
13 different chains). You'll hook each of the two computers in the cluster to
14 one port on each controller and then use multipathing to tell linux both
15 SCSI paths are the same device. You'll have a second external storage
16 connected the same way and use software mirroring across the two. Then,
17 partition the mirror set (you could also partition at the external
18 storage, but then you have to update the partitions on each storage) and
19 lay GFS down.
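
One way to sanity-check the wiring described above is to model every
host-to-enclosure path as a set of components and look for any single
component whose failure cuts off all paths. This is only a rough sketch:
the node, enclosure, and controller names are made up, and it assumes each
host has one HBA per controller chain and a dedicated cable per path, which
the description above does not spell out.

```python
from itertools import product

# Hypothetical names for illustration only.
HOSTS = ["node1", "node2"]
ENCLOSURES = ["encl1", "encl2"]   # the two mirrored external storage units
CONTROLLERS = ["ctlA", "ctlB"]    # two controllers per enclosure


def build_paths(hosts=HOSTS, enclosures=ENCLOSURES, controllers=CONTROLLERS):
    """Return one component set per host -> enclosure path."""
    paths = []
    for host, encl, ctl in product(hosts, enclosures, controllers):
        paths.append(frozenset({
            host,                            # the computer itself
            f"{host}/hba-{ctl}",             # assumed: one HBA per controller chain
            f"cable:{host}->{encl}/{ctl}",   # assumed: one cable per path
            f"{encl}/{ctl}",                 # controller inside the enclosure
            encl,                            # the enclosure
        }))
    return paths


def storage_reachable(paths, failed):
    """True if at least one path contains no failed component."""
    return any(not (path & failed) for path in paths)


def single_points_of_failure(paths):
    """List every component whose failure alone breaks all paths."""
    components = set().union(*paths)
    return sorted(c for c in components if not storage_reachable(paths, {c}))


if __name__ == "__main__":
    print(single_points_of_failure(build_paths()))
```

With both enclosures and both controllers in place the list comes back
empty. Dropping the second enclosure from the model makes the remaining
enclosure show up as a single point of failure.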
20
21 At this point, you don't lose connectivity to your storage if a cable, an
22 hba, an enclosure, a controller, or a computer goes down. Of course, the
23 controllers will handle RAID 5 or RAID 6 so you won't lose even a single
24 path in case of HD failure. GFS should allow concurrent access --
25 possibly even with multiple r/w mounts. ext2/3, jfs, xfs, reiserfs, and
26 even reiser4 are not cluster aware so they will only work properly in the
27 configuration with multiple r/o mounts *OR* a single r/w mount.
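
The mount rule above can be written down as a small check. This is a
sketch, not a real tool: the function name and the "ro"/"rw" mode strings
are made up for illustration, and treating GFS as safe for multiple r/w
mounts is an assumption based on the "should allow" above.

```python
# Filesystems assumed to coordinate concurrent access across nodes.
# GFS is listed on the strength of the text above; verify before relying on it.
CLUSTER_AWARE = {"gfs"}


def mounts_are_safe(fs_type, mount_modes):
    """Check a set of simultaneous mounts of one shared device.

    fs_type     -- filesystem name, e.g. "ext3" or "gfs"
    mount_modes -- one entry per node that mounts it, each "ro" or "rw"
    """
    if fs_type.lower() in CLUSTER_AWARE:
        return True
    writers = mount_modes.count("rw")
    if writers == 0:
        # Any number of read-only mounts is fine.
        return True
    # With a writer present, it must be the only mount at all.
    return len(mount_modes) == 1


if __name__ == "__main__":
    print(mounts_are_safe("ext3", ["ro", "ro"]))   # many readers
    print(mounts_are_safe("ext3", ["rw", "ro"]))   # writer plus reader
```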
28
29 > - Load balancing. Tasks should migrate between nodes.
30
31 HP's ServiceGuard for linux is the only software I know that will do this
32 (I'm *sure* there are other commercial solutions), and there is still
33 some small amount of downtime when a task migrates, so they aren't
34 automatically generated.
35
36 Also, some software (IIRC, WebLogic) is able to exist in a clustered
37 environment with some method to sync state across individual nodes
38 (possibly using the clustered FS) so that instead of
39 jobs/packages/daemons/tasks migrating it just runs on all nodes all the
40 time.
41
42 The second option (a cluster-aware program) is usually preferable, because
43 the program itself is better at determining what state needs to be shared,
44 so you get less inter-node communication and less downtime in case a node
45 fails. *However*, an external failover/load-balancer may either be your
46 only solution (if you are already attached to a certain, non-cluster-aware
47 program) or provide better behavior in the case the program is buggy
48 (especially if its failure mode corrupts and/or brings down other nodes).
49
50 > - Redundancy, so that the death of a machine doesn't take the cluster
51 > or any processes down.
52
53 I believe there's a userland implementation of the CARP protocol that may
54 work for linux. It allows 2 (or more) machines on the same network to
55 share an IP and fail over and/or load-balance handling packets directed to
56 that IP.
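
To make the shared-IP failover idea concrete, here is a toy model of the
election. It is not the CARP wire protocol, just a simplification: among
the nodes still alive, the one with the lowest skew value owns the shared
IP. The class, field, and node names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    advskew: int        # lower value = preferred owner of the shared IP
    alive: bool = True


def current_master(nodes):
    """Return the name of the node that should answer for the shared IP."""
    candidates = [n for n in nodes if n.alive]
    if not candidates:
        return None
    # Ties on advskew are broken by name so the result is deterministic.
    return min(candidates, key=lambda n: (n.advskew, n.name)).name


if __name__ == "__main__":
    cluster = [Node("web1", advskew=0), Node("web2", advskew=100)]
    print(current_master(cluster))
    cluster[0].alive = False          # simulate the preferred node dying
    print(current_master(cluster))
```

When the preferred node is marked dead, the backup takes over the address;
when it comes back, it wins the election again.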
57
58 > So, anyone doing linux clusters?
59
60 Not personally, but I was looking into them some during my last job.
61 (Trying to get a customer to switch to linux.)
62
63 --
64 "If there's one thing we've established over the years,
65 it's that the vast majority of our users don't have the slightest
66 clue what's best for them in terms of package stability."
67 -- Gentoo Developer Ciaran McCreesh