[Linux-cluster] RE: HA Clustering - Need Help

Alan Wood chekov at ucla.edu
Wed Jan 24 23:46:24 UTC 2007


some quick comments on your post from someone who has tried an 
active-active cluster on a shared SCSI device.

1.  If you want the same block partition mounted on two different 
computers at the same time, then you need a cluster file system like 
GFS; you can't use ext3.  There are other cluster filesystems out there 
(like Lustre), but GFS is the one most tightly integrated with the RH Cluster 
Suite and is designed for high availability as opposed to parallel computing 
(a gfs_mkfs sketch follows this list).
2.  If you are going to run GFS in a production environment, the 
recommendation is not to use a 2-node cluster.  GFS 5 required 3 nodes, but 
GFS 6 offers a 2-node option.  However, with only two nodes it is harder to 
know which node is "broken" when something goes wrong, so you'll note a lot 
of discussion on this list about fencing gone awry and needing some sort of 
tiebreaker like a quorum disk.  If you take care in setting it up, a 2-node 
cluster will work, but you'll want to test it extensively before putting it 
into production (a minimal two-node cluster.conf sketch also follows).
3.  multipathing should work fine, and you can build clvm volumes on top of 
multipath devices.  Software RAID is a different thing and not really 
related (see the mdadm/clvm sketch below).
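
to make point 1 concrete, creating a GFS filesystem for a DLM-based 
cluster looks roughly like this.  the cluster name (mycluster), filesystem 
name (oradata) and LV path are placeholders I made up, and you need one 
journal (-j) per node that will mount the filesystem:

   # -p picks the lock protocol; -t is <clustername>:<fsname> and
   # must match the cluster name in your cluster.conf
   gfs_mkfs -p lock_dlm -t mycluster:oradata -j 2 /dev/vg0/oralv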
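
on point 2, the 2-node special case is enabled in cluster.conf via cman's 
two_node flag.  this is a from-memory sketch, not a complete working config 
(node names are invented, and you must add real fencing; fencing is 
mandatory):

   <?xml version="1.0"?>
   <cluster name="mycluster" config_version="1">
     <!-- two_node lets a single node keep quorum; both
          attributes must be set together -->
     <cman two_node="1" expected_votes="1"/>
     <clusternodes>
       <clusternode name="node1" votes="1"/>
       <clusternode name="node2" votes="1"/>
     </clusternodes>
     <!-- define your real fence devices here -->
     <fencedevices/>
   </cluster>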
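
and for point 3, mdadm assembles the two SCSI paths into one md device and 
CLVM sits on top of that as usual.  device names here are hypothetical; 
-c y marks the volume group clustered so clvmd coordinates its metadata 
between nodes:

   # build one multipath md device from the two paths to the array
   mdadm --create /dev/md0 --level=multipath --raid-devices=2 \
         /dev/sdb1 /dev/sdc1
   # layer LVM on the md device, not on the raw paths
   pvcreate /dev/md0
   vgcreate -c y vg0 /dev/md0
   lvcreate -L 100G -n oralv vg0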

as for recommendations:
1.  don't use shared SCSI storage.  I and others have had reliability 
issues with it under heavy load.
2.  use more than 2 nodes.
3.  go active-passive if possible.  as is often pointed out, the whole 
idea of a high-availability cluster is that there is enough processing 
horsepower left to handle the entire load if one node fails.  in a 2-node 
cluster you therefore have to provision each node to be able to run 
everything.  given that, it is far easier to set things up so that one node 
runs everything and the other awaits failure than to run active-active (see 
the failover-service sketch below).
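
to illustrate active-passive: with ext3 and rgmanager, the filesystem is a 
cluster resource that is mounted only on whichever node currently owns the 
service and is remounted on the survivor at failover (which is why the docs 
keep talking about mounting during failover).  the resource and script names 
below are invented for illustration; adjust to your setup:

   <rm>
     <resources>
       <fs name="orafs" device="/dev/vg0/oralv" mountpoint="/oracle"
           fstype="ext3" force_unmount="1"/>
       <script name="orastart" file="/etc/init.d/oracle"/>
     </resources>
     <service name="oracle" autostart="1">
       <fs ref="orafs"/>
       <script ref="orastart"/>
     </service>
   </rm>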

just my $.02
-alan

> Date: Tue, 23 Jan 2007 23:03:58 +0530
> From: "Net Cerebrum" <netcerebrum at gmail.com>
> Subject: [Linux-cluster] HA Clustering - Need Help
> To: linux-cluster at redhat.com
>
> Hello All,
>
> I am totally new to HA clustering and am trying hard to grasp the
> fundamentals in a limited time frame. I have been asked by my company to
> create a high availability cluster using Red Hat Cluster Suite on hardware
> comprising two servers running RHEL AS 4 and one shared external storage
> array. The cluster would be running in Active-Active state. Oracle Database
> version 9 (not RAC) would run on one of the servers while the Oracle
> Applications version 11 would run on the other. In case of failure of either
> of the servers, the service would be started on the other server. Both the
> servers (nodes) would be connected to the storage array through two
> redundant SCSI controllers.
>
> Since the storage has redundant controllers, both the servers would be
> connected to the storage array using two channels each and the requirement
> is to make it an Active-Active Load Balanced configuration using a multipath
> software. The storage vendor has suggested using the multipath option with
> the mdadm software for creating multipath devices on the storage array.
>
> I have gone through the manuals and since this is my first attempt at high
> availability clustering I have many doubts and questions. What file system
> should be used on the external storage? Is it better to use ext3 or Red Hat
> GFS? At certain places it is mentioned that GFS should be used only if the
> number of nodes is 3 or more and GULM is being used. Since we have only two
> nodes, we plan to use DLM.  It is also mentioned that GFS and CLVM may not
> work on a software RAID device. Would the multipath devices created
> (/dev/md0, /dev/md1, etc.) be considered software RAID devices, though
> in the real sense they are not? Further, the development team is not too
> sure about the compatibility between GFS and Oracle Database and
> Applications. What could be the pros and cons of using the ext3 file system
> in this scenario?
>
> The development team just wants one filesystem to be used on the storage
> which would be mounted as /oracle on both the servers / nodes and all the
> binaries and data would reside on this. Since this filesystem is going to be
> mounted at boot time, my understanding is that no mounting or unmounting of
> any filesystem will take place during the failover so the cluster
> configuration should reflect that. The documentation repeatedly refers to
> mounting of the file systems when failover takes place so that's giving rise
> to a little confusion. Further there are references to a quorum partition in
> documentation but I have not been able to find any provision to make use of
> the same in the cluster configuration tool.
>
> Please help me clarify these issues and suggest how I should go about
> setting up this cluster. I would be really grateful for any suggestions and
> references.
>
> Thanks,



