[Linux-cluster] CS4 update 2 / pb of clustat stalled (contd.)

Alain Moulle Alain.Moulle at bull.net
Thu Jul 20 06:26:39 UTC 2006


Back to the problem of clustat sometimes remaining stalled ...

Context :
Two nodes :
node 1 with an active service
node 2 as backup

Test :
I power off node 1 to force the failover to node 2.
Just after the poweroff on node 1, I launch clustat
every second on node 2.
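
A minimal sketch of such a polling loop, which also measures how long each
clustat call blocks. Only the clustat command itself comes from the test
above; the helper names (time_command, poll, report), the 60 s timeout and
the number of rounds are assumptions chosen for illustration.

```python
import subprocess
import sys
import time


def time_command(cmd, timeout_s):
    """Run cmd once and return (elapsed_seconds, timed_out)."""
    start = time.monotonic()
    try:
        # Output is discarded: only the duration of the call matters here.
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
        timed_out = False
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left behind.
        timed_out = True
    return time.monotonic() - start, timed_out


def poll(cmd, interval_s=1.0, timeout_s=60.0, rounds=10):
    """Call cmd roughly every interval_s seconds and collect the timings."""
    results = []
    for _ in range(rounds):
        elapsed, timed_out = time_command(cmd, timeout_s)
        results.append((elapsed, timed_out))
        # If the call itself took longer than the interval, do not sleep at all.
        time.sleep(max(0.0, interval_s - elapsed))
    return results


def report(results, out=sys.stdout):
    """Print one line per call, flagging the calls that hit the timeout."""
    for index, (elapsed, timed_out) in enumerate(results):
        status = "TIMEOUT" if timed_out else "ok"
        print(f"call {index}: {elapsed:.2f}s {status}", file=out)


# Hypothetical usage on node 2, right after powering off node 1:
#   report(poll(["clustat"], interval_s=1.0, timeout_s=60.0, rounds=120))
```

A call whose duration is far above the one-second interval, or that hits
the timeout, marks the window during which clustat is stalled.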

Behavior :
After the 21 s needed to detect the missing heartbeat,
I've checked that clustat is stalled in rg_state_list (called
by msg_receive_simple) for the whole time the service that
was active on node 1 is being relaunched on node 2.
So, depending on how long the application takes to
start successfully, clustat could remain stalled
for a while.
Is this normal behavior?
Or is it a bug, and if so, is it already
fixed in later CS4 updates?

And if the application fails to start, could
clustat remain stalled indefinitely?

Thanks for your thoughts about that.
Alain Moullé
