[Linux-cluster] RE: HA Clustering - Need Help

Net Cerebrum netcerebrum at gmail.com
Mon Jan 29 06:52:22 UTC 2007


On 1/25/07, Alan Wood <chekov at ucla.edu> wrote:
>
> some quick comments on your post from someone who has tried an
> active-active cluster on a shared SCSI device.
>
> 1.  If you want to have the same block partition mounted on two different
> computers at the same time, then you need a cluster file system like
> GFS; you can't use ext3.  There are other cluster filesystems out there
> (like Lustre), but GFS is the one most tightly tied to the RH Cluster Suite
> and designed for high availability as opposed to parallel computing.
> 2.  If you are going to run GFS in a production environment, the
> recommendation is to not use 2-node.  GFS 5 required 3 nodes, but GFS 6
> offers a 2-node option.  However, when using two nodes it is harder to know
> which node is "broken" when something goes wrong, so you'll note a lot of
> discussion on this list about fencing gone awry and needing some sort of
> tiebreaker like a quorum disk.  If you take care in setting it up, a 2-node
> cluster will work, but you'll want to test it extensively before putting it
> into production.
> 3.  multipathing should work fine and you can build clvm volumes on top of
> multipath devices.  Software RAID is different and not really related.
>
> as for recommendations:
> 1.  don't use SCSI shared storage.  I and others have had reliability
> issues under heavy load in these scenarios.
> 2.  use more than 2 nodes.
> 3.  go active-passive if possible.  as is often pointed out, the entire
> idea of a high availability cluster is that there is enough processing
> horsepower to handle the entire remaining load if one node fails.  in a
> 2-node cluster you'll have to provision each node to be able to run
> everything.  it is therefore far easier to set it up so that one node runs
> everything and the other node awaits failure than to run active-active.
>
> just my $.02
> -alan
>
>
>
Thank you very much for your excellent suggestions and tips, but I could not
follow some of them since I am bound by the specifications laid down by the
development team looking into this project. I have made substantial progress
and a large number of issues have been resolved. Since it had to be an
Active-Active configuration with both nodes accessing the shared storage at
the same time, we have gone for GFS as the file system, using the latest
release as you suggested. The documentation for the current release of RHCS
does not talk about any quorum partitions, but as you suggested, I have left
some space partitioned which could be used for that purpose if the need
arises. The multipathing is also working fine using the md driver, and we
have been able to build logical volumes on top of the multipath devices.
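
For reference, the storage setup is roughly along the following lines (the
device names, volume group name, size and cluster/filesystem names below are
only placeholders, not the actual ones we used):

    # two paths (/dev/sdb and /dev/sdc here) to the same LUN, combined with the md driver
    mdadm --create /dev/md0 --level=multipath --raid-devices=2 /dev/sdb /dev/sdc

    # LVM built on top of the multipath device
    pvcreate /dev/md0
    vgcreate sharedvg /dev/md0
    lvcreate -L 100G -n data_lv sharedvg

    # GFS with the DLM lock manager, two journals for the two nodes
    gfs_mkfs -p lock_dlm -t mycluster:data -j 2 /dev/sharedvg/data_lv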

I am now dealing with the issue of configuring the network interfaces. As of
now I have configured Ethernet bonding on each of the hosts to achieve
network interface redundancy as well. However, this puts a lot of traffic on
the bonded interfaces, since the same interfaces are also being used for
heartbeat / monitoring. Therefore, I am thinking of using the two Ethernet
interfaces individually: one interface for monitoring, and the other for the
LAN through which the clients will access the hosts. They would be connected
to separate switches, and the fence devices would also be on the monitoring /
control network. So I assume the arrangement would be something like:

Node A
eth0 - 192.168.100.1
eth1 - 172.16.1.101
fence device - 192.168.100.11

Node B
eth0 - 192.168.100.2
eth1 - 172.16.1.102
fence device - 192.168.100.12

The eth0 interfaces and the fence devices would be connected through one
switch, while the other interfaces (eth1) would be on the LAN from which the
clients access the hosts. In addition, there would be two more floating /
shared IP addresses, 172.16.1.201 for the database server and 172.16.1.202
for the application server, which would be defined in the Resources section
of the Cluster Configuration Tool and would not be mentioned in /etc/hosts
(as I read somewhere in the documentation).
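
My understanding is that in /etc/cluster/cluster.conf this would translate to
something roughly like the following (the node names, fence agent, login and
password are only placeholders; the real ones will of course differ):

    <?xml version="1.0"?>
    <cluster name="mycluster" config_version="1">
      <cman two_node="1" expected_votes="1"/>
      <clusternodes>
        <clusternode name="nodea" votes="1">
          <fence>
            <method name="1">
              <device name="nodea-fence"/>
            </method>
          </fence>
        </clusternode>
        <clusternode name="nodeb" votes="1">
          <fence>
            <method name="1">
              <device name="nodeb-fence"/>
            </method>
          </fence>
        </clusternode>
      </clusternodes>
      <!-- fence devices on the 192.168.100.0 monitoring / control network -->
      <fencedevices>
        <fencedevice name="nodea-fence" agent="fence_ipmilan" ipaddr="192.168.100.11" login="admin" passwd="..."/>
        <fencedevice name="nodeb-fence" agent="fence_ipmilan" ipaddr="192.168.100.12" login="admin" passwd="..."/>
      </fencedevices>
      <!-- floating IPs for the database and application servers -->
      <rm>
        <resources>
          <ip address="172.16.1.201" monitor_link="1"/>
          <ip address="172.16.1.202" monitor_link="1"/>
        </resources>
      </rm>
    </cluster>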

Please let me know if these assumptions are correct. I am also wondering how
the cluster manager figures out which interfaces to use for heartbeat and
monitoring; I haven't seen any such configuration option in the
system-config-cluster program.
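
My guess is that the cluster traffic simply follows whichever addresses the
node names in cluster.conf resolve to, so the node names would have to
resolve to the eth0 / monitoring addresses on both nodes, e.g. in /etc/hosts
(nodea / nodeb being placeholder names) - please correct me if that is wrong:

    192.168.100.1   nodea        # monitoring / heartbeat network
    192.168.100.2   nodeb
    172.16.1.101    nodea-lan    # client LAN
    172.16.1.102    nodeb-lan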

The issue which then needs to be resolved is that of assigning hostname
aliases to the shared IP addresses, since, according to the developers, the
application server and the database need to use a hostname and not an IP
address.
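
I suppose something along these lines would do, either in DNS or in the
clients' hosts files (dbsrv and appsrv are just placeholder names):

    172.16.1.201   dbsrv     # floating IP for the database server
    172.16.1.202   appsrv    # floating IP for the application server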

Looking forward to your comments,

Thanks a lot.