[Linux-cluster] CS4/ question about load on Heart Beat

Alain Moulle Alain.Moulle at bull.net
Tue Jan 24 12:09:22 UTC 2006


Alain Moulle wrote:

>> Hi
>>
>> I wonder what the strategy is in CS4 when
>> the Heart Beat network is overloaded for
>> a while, so much so that none of the nodes
>> gets a response to its heartbeat checks.
>>
>> Do all nodes in the cluster decide to fence (reboot)
>> their neighbours, and succeed in doing so once the
>> load on the network decreases?
>> Or something else?
>> Is there any safeguard on this point to
>> avoid CS4 issuing fence reboot requests towards
>> all nodes in the cluster just because the
>> network is overloaded?
>>


> CMAN uses quorum to decide whether it can carry on operating after a cluster
> split. If more than half of the nodes are still talking to each other then
> they will have quorum and will fence the remaining nodes.
>
> If none of the nodes can see any other node (eg ethernet switch failure) then
> none of the nodes will have quorum on its own so no fencing will be done.
>
> If you subsequently reconnect the nodes after that catastrophe they will all
> drop out of the cluster as no node can be sure of the state of any other node
> - to do so would endanger data. So you will need to restart cluster services
> on all nodes.
> -- patrick
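
To make the quorum rule described above concrete, here is a minimal Python
sketch. It assumes one vote per node and a strict-majority rule ("more than
half"); the function names are hypothetical illustrations and are not part
of CMAN or its API.

```python
from typing import List, Set


def has_quorum(partition_size: int, cluster_size: int) -> bool:
    """True if a partition holds more than half of the cluster's votes.

    Assumption: every node carries exactly one vote.
    """
    return 2 * partition_size > cluster_size


def partitions_that_may_fence(partitions: List[Set[str]]) -> List[Set[str]]:
    """Return the partitions that keep quorum after a cluster split.

    Only a quorate partition would go on to fence the nodes it can no
    longer see; all other partitions do nothing.
    """
    cluster_size = sum(len(p) for p in partitions)
    return [p for p in partitions if has_quorum(len(p), cluster_size)]


if __name__ == "__main__":
    # Five nodes split 3/2: only the three-node side keeps quorum.
    print(partitions_that_may_fence([{"n1", "n2", "n3"}, {"n4", "n5"}]))

    # Total loss of the heartbeat network: every node is alone,
    # so no partition has quorum and nobody fences anybody.
    print(partitions_that_may_fence([{"n1"}, {"n2"}, {"n3"}, {"n4"}, {"n5"}]))
```

Under this simple majority rule a lone node in a two-node cluster can never
hold more than half of the votes, which is why the two-node case has to be
treated separately.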

Hi
And thanks Patrick for this detailed answer.
But just one further question: what about the case of a cluster
with only two nodes, where the quorum mechanism can't be
applied: will we be in your second case too (when
none of the nodes can see any other node)?
Or will both nodes immediately try to fence each other?
Thanks
Alain Moullé

-- 



mailto:Alain.Moulle at bull.net
+-------------------------------+---------------------------------+
| Alain Moullé                  | from France : 04 76 29 75 99    |
|                               | FAX number  : 04 76 29 72 49    |
| Bull SA                       |                                 |
| 1, Rue de Provence            | Adr  : FREC B1-041              |
| B.P. 208                      |                                 |
| 38432 Echirolles - CEDEX      | Email: Alain.Moulle at bull.net |
| France                        | BCOM : 229 7599                 |
+-------------------------------+---------------------------------+




