[Linux-cluster] RH Cluster issue-Network Failover not happening

Sun Dec 23 11:19:35 UTC 2007

Some additional information on the issue below: the network cable I
disconnect is the one on the public interface. Is this the correct way to
test the cluster? Can anyone explain whether, if the public interface is
disconnected on the primary (Node 1), the secondary (Node 2) will detect
that the primary is down?

1. Can you check whether multicast is turned on in your Ethernet switch (if
it is a managed one)? Ans: It's not a managed switch.

-----Original Message-----
From: Harun [mailto:harun at mhd.co.om] 
Sent: Sunday, December 23, 2007 8:16 AM
To: 'linux clustering'
Subject: RH Cluster issue-Network Failover not happening

Issue: When the network cable is disconnected from the primary, the primary
restarts uncleanly and failover to the secondary does not happen. The shared
drives are not mounted automatically on the secondary, nor are they mounted
on the primary after it restarts. I then have to manually shut down both
primary and secondary, and start the primary first and then the secondary,
for the setup to work again.

I want to test a live production setup: a two-node Linux cluster running
Linux Advanced Server (Linux DB-Primary 2.4.21-37.ELsmp #1 SMP Wed Sep 7
13:28:55 EDT 2005 i686 i686 i386 GNU/Linux). An Oracle database is running
on this setup.

The clumanager version is 1.2.28 and the redhat-config-cluster version is
1.0.8 on both primary and secondary. I would like to resolve the issue
without any upgrades or patch updates. Do you think that updating could
resolve the issue? If an upgrade is required, please advise how to proceed.
Is this a configuration problem or a known issue with the versions used?

Cluster.xml looks like this.

<?xml version="1.0" ?>
<cluconfig version="3.0">
  <clumembd broadcast="no" interval="1000000" loglevel="5" multicast="yes" multicast_ipaddress="225.0.0.11" thread="yes" tko_count="25" />
  <cluquorumd loglevel="5" pinginterval="2" tiebreaker_ip="" />
  <clurmtabd loglevel="5" pollinterval="4" />
  <clusvcmgrd loglevel="5" />
  <clulockd loglevel="5" />
  <cluster config_viewnumber="74" key="e307059f0bed8596868c4ab99818dc5d" name="Oracle_cluster" />
  <sharedstate driver="libsharedraw.so" rawprimary="/dev/raw/raw1" rawshadow="/dev/raw/raw2" type="raw" />
  <members>
    <member id="0" name="DB-Primary" watchdog="yes" />
    <member id="1" name="DB-Secondary" watchdog="yes" />
  </members>
  <services>
    <service checkinterval="0" failoverdomain="None" id="0" maxfalsestarts="3" maxrestarts="0" name="AOFSPROD" userscript="">
      <service_ipaddresses>
        <service_ipaddress broadcast="None" id="0" ipaddress="192.168.0.7" monitor_link="0" netmask="255.255.255.0" />
      </service_ipaddresses>
      <device id="0" name="/dev/sdb3" sharename="">
        <mount forceunmount="yes" fstype="ext3" mountpoint="/data1" options="" />
      </device>
      <device id="1" name="/dev/sdb4" sharename="">
        <mount forceunmount="yes" fstype="ext3" mountpoint="/data2" options="" />
      </device>
      <device id="2" name="/dev/sdc1" sharename="">
        <mount forceunmount="yes" fstype="ext3" mountpoint="/backup" options="" />
      </device>
      <device id="3" name="/dev/sdd1" sharename="">
        <mount forceunmount="yes" fstype="ext3" mountpoint="/archive" options="" />
      </device>
    </service>
  </services>
  <failoverdomains />
</cluconfig>
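
To make the failover-related settings in this file easier to review, here is
a minimal Python sketch that pulls them out of the XML. It is an
illustration only, not part of clumanager. It rests on assumptions that are
not stated in this thread: that `interval` is in microseconds, that a member
is declared down after `tko_count` missed heartbeats, and that
`monitor_link="0"` means link monitoring is off for that service IP. The
names `SAMPLE` and `summarize` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the relevant elements from the cluster.xml above.
SAMPLE = """<?xml version="1.0" ?>
<cluconfig version="3.0">
  <clumembd broadcast="no" interval="1000000" loglevel="5" multicast="yes" multicast_ipaddress="225.0.0.11" thread="yes" tko_count="25" />
  <cluquorumd loglevel="5" pinginterval="2" tiebreaker_ip="" />
  <services>
    <service id="0" name="AOFSPROD">
      <service_ipaddresses>
        <service_ipaddress id="0" ipaddress="192.168.0.7" monitor_link="0" netmask="255.255.255.0" />
      </service_ipaddresses>
    </service>
  </services>
</cluconfig>
"""


def summarize(xml_text):
    """Return the settings most relevant to failure detection."""
    root = ET.fromstring(xml_text)
    membd = root.find("clumembd")
    quorumd = root.find("cluquorumd")

    interval_us = int(membd.get("interval"))
    tko_count = int(membd.get("tko_count"))

    return {
        # ASSUMPTION: interval is in microseconds and detection takes
        # roughly interval * tko_count.
        "detect_seconds": interval_us * tko_count / 1e6,
        # An empty tiebreaker_ip attribute counts as "not set".
        "tiebreaker_set": bool(quorumd.get("tiebreaker_ip")),
        # ASSUMPTION: monitor_link="0" means the link is not monitored.
        "unmonitored_ips": [
            ip.get("ipaddress")
            for ip in root.iter("service_ipaddress")
            if ip.get("monitor_link") == "0"
        ],
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(key, "=", value)
```

If those assumptions hold, the configuration above would take on the order
of 25 seconds to declare a member down, has no tiebreaker IP set, and does
not monitor the link for 192.168.0.7. These would be worth checking against
the clumanager documentation for this version.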
