[Linux-cluster] CS4/ question about load on Heat Beat

Patrick Caulfield pcaulfie at redhat.com
Tue Jan 24 12:38:19 UTC 2006


Alain Moulle wrote:
> Alain Moulle wrote:
> 
> 
>>>Hi
>>>
>>>I wonder what the strategy is in CS4 when
>>>the Heart Beat network is overloaded for
>>>a while, so much so that none of the nodes
>>>gets a response to its heart beat checks.
>>>
>>>Do all nodes in the cluster decide to fence (reboot)
>>>their neighbours, and succeed in doing so once the
>>>load on the network lessens?
>>>Or what?
>>>Is there any safeguard on this point to
>>>avoid CS4 sending fence reboot requests to
>>>all nodes in the cluster just because the
>>>network is overloaded?
>>>
> 
> 
> 
>>CMAN uses quorum to decide whether it can carry on operating after a cluster
>>split. If more than half of the nodes are still talking to each other then
>>they will have quorum and will fence the remaining nodes.
>>
>>If none of the nodes can see any other node (eg ethernet switch failure) then
>>none of the nodes will have quorum on its own so no fencing will be done.
>>
>>If you subsequently reconnect the nodes after that catastrophe they will all
>>drop out of the cluster as no node can be sure of the state of any other node
>>- to do so would endanger data. So you will need to restart cluster services
>>on all nodes.
>>-- patrick
> 
> 
> Hi
> And thanks Patrick for this detailed answer.
> But just a further question: what about the case of a cluster
> with only two nodes, where the quorum mechanism can't be
> applied: will we be in your second case too (when
> none of the nodes can see any other node)?

cman has a special "two_node" mode which allows the cluster to continue with
only one vote. There is simply a race to see which node gets fenced first!

> Or will both nodes immediately try to fence each other?
> Thanks

yes!
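
To make the cases above concrete, here is a rough sketch of the quorum rule
as described in this thread. It is a simplified model, not cman's actual
code: the function names are made up for illustration, and it assumes one
vote per node and a plain "more than half" majority.

```python
def quorum_threshold(expected_votes: int, two_node: bool = False) -> int:
    """Votes a partition needs in order to be quorate (simplified model)."""
    if two_node:
        # Assumption: in "two_node" mode a single vote is enough to carry on.
        return 1
    # "More than half" of the expected votes.
    return expected_votes // 2 + 1


def may_fence(visible_votes: int, expected_votes: int, two_node: bool = False) -> bool:
    """True if a partition holding `visible_votes` has quorum and may fence the rest."""
    return visible_votes >= quorum_threshold(expected_votes, two_node)


if __name__ == "__main__":
    # 5-node cluster split 3/2: only the majority side may fence.
    print(may_fence(3, 5), may_fence(2, 5))
    # 5-node cluster, every node isolated: nobody has quorum, nobody fences.
    print(may_fence(1, 5))
    # 2-node cluster in two_node mode, link lost: both sides may fence (the race).
    print(may_fence(1, 2, two_node=True))
```

In the two_node case both isolated nodes pass the check, which is why it
comes down to a race.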

-- 

patrick



