I think your problem is in your Red Hat cluster fencing. At my current job we use a SAN and we have the same problem; the only workaround is a fence delay in the Red Hat cluster fencing agent.

2012/1/26 <jayesh.shinde@netcore.co.in>

> Dear Digimer & Kaloyan Kovachev,
>
> Do you think this server shutdown problem (when both nodes fence
> simultaneously via drbd.conf) can be completely avoided if I use a SAN
> disk instead of a DRBD disk? That is, with a SAN disk, will the fence
> config defined in cluster.conf take care of a network failure and the
> related fencing of the node?
>
> What would you suggest, SAN or DRBD disk? Please guide me.
>
> Regards
> Jayesh Shinde
>
> Quoting Digimer <linux@alteeve.com>:
>
> > On 01/25/2012 08:57 AM, jayesh.shinde wrote:
> >> Hi Kaloyan Kovachev,
> >>
> >> I am using the config below in drbd.conf, which is mentioned in the
> >> DRBD cookbook.
> >>
> >> }
> >> disk {
> >>     fencing resource-and-stonith;
> >> }
> >> handlers {
> >>     outdate-peer "/sbin/obliterate";
> >>
> >> The /sbin/obliterate script calls "fence_node".
> >>
> >> Do you know what the default method is with "fence_node $REMOTE",
> >> i.e. reboot or power-off?
> >>
> >> Dear Digimer,
> >>
> >> Can you please guide me here.
> >>
> >> Currently I do not have a test machine to try it on, so all members'
> >> input will help me a lot to understand it.
> >>
> >> Below is the /sbin/obliterate
> >
> > I updated the tutorial to address this last night;
> >
> > https://alteeve.com/w/2-Node_Red_Hat_KVM_Cluster_Tutorial#Hooking_DRBD_Into_The_Cluster.27s_Fencing
> >
> > and
> >
> > https://alteeve.com/w/2-Node_Red_Hat_KVM_Cluster_Tutorial#Configuring_DRBD_Global_and_Common_Options
> >
> > In short, this is a problem where the fence devices, IPMI and DRAC here,
> > get the call to shut down their host but don't act on it fast enough to
> > block the call heading to the other node.
> >
> > The obliterate scripts (obliterate is an older version of
> > obliterate-peer.sh, which I am working to replace with rhcs_fence now)
> > call cman to remove the peer node from the cluster, then call the actual
> > fence. For this reason, the delay set in cluster.conf won't help.
> >
> > One option is to add a 'sleep 10;' to the start of *one* node's
> > obliterate or obliterate-peer.sh script. Alternatively, rhcs_fence uses
> > the node's ID to calculate a delay automatically to help avoid these
> > dual-fence scenarios.
> >
> > --
> > Digimer
> > E-Mail: digimer@alteeve.com
> > Papers and Projects: https://alteeve.com
>
> --
> Linux-cluster mailing list
> Linux-cluster@redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

--
This is my life, and I will live it for as long as God wills.
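For anyone who wants to try the delay workaround from the quoted reply, here is a minimal sketch of a per-node delay derived from the node ID. It is only an illustration of the idea, not the actual rhcs_fence logic: the function name `fence_delay_for_node`, the 10-second step, and the `MY_NODE_ID` variable are all assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper (not taken from obliterate, obliterate-peer.sh or rhcs_fence):
# print a fence delay in seconds derived from the cluster node ID.
# Node 1 waits 0 s, node 2 waits one step, node 3 two steps, and so on,
# so two nodes never fire their fence calls at the same instant.
fence_delay_for_node() {
    node_id="$1"
    step="${2:-10}"    # assumed default step of 10 s, matching the 'sleep 10;' suggestion

    # Reject empty or non-numeric arguments.
    case "$node_id" in ''|*[!0-9]*) echo "node ID must be a positive integer" >&2; return 1 ;; esac
    case "$step" in ''|*[!0-9]*) echo "step must be a non-negative integer" >&2; return 1 ;; esac

    # Node IDs are assumed to start at 1.
    if [ "$node_id" -lt 1 ]; then
        echo "node ID must be >= 1" >&2
        return 1
    fi

    echo $(( (node_id - 1) * step ))
}

# Example use near the top of a fence handler script, before the peer is fenced
# (MY_NODE_ID is assumed to be set by hand on each node):
#   sleep "$(fence_delay_for_node "$MY_NODE_ID")"
```

With two nodes this comes to the same thing as putting `sleep 10;` in only one node's script: node 1 fences immediately, while node 2 waits long enough to be shut down first if node 1 is still alive.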