[Linux-cluster] random STONITHS

danwest at comcast.net
Mon Dec 12 18:59:24 UTC 2005


I have a ten-node cluster built from the latest patched RHEL3 source.  Five of the ten nodes are spares.  The last member (“members%members9%name = node10”) randomly STONITHs one of the other nodes for no apparent reason.  Both active and spare nodes get fenced.  The only thing I can find in common is that the STONITH is always issued by the last member, node10.  The nodes are not particularly loaded.  Does anyone have any ideas why this may be happening?
 
Here is an excerpt from /var/log/messages:
 
messages.1:Dec  5 02:50:11 node10 cluquorumd[5168]: <warning> --> Commencing STONITH <--
messages.1:Dec  5 02:50:17 node10 cluquorumd[5168]: <notice> STONITH: node3 has been fenced!
messages.1:Dec  5 02:50:23 node10 cluquorumd[5168]: <notice> STONITH: node3 is no longer fenced off.
messages.1:Dec  5 05:29:32 node10 cluquorumd[5168]: <warning> --> Commencing STONITH <--
messages.1:Dec  5 05:29:39 node10 cluquorumd[5168]: <notice> STONITH: node3 has been fenced!
messages.1:Dec  5 05:29:45 node10 cluquorumd[5168]: <notice> STONITH: node3 is no longer fenced off.
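
To check whether the same issuer shows up across all of the rotated logs, the fencing lines could be tallied with a small script. This is only a rough sketch: the regular expression assumes every line has the shape shown in the excerpt above, and the function name tally_fencing is made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape, taken only from the excerpt above:
#   [prefix:]Mon DD HH:MM:SS <host> cluquorumd[<pid>]: <notice> STONITH: <victim> has been fenced!
FENCED = re.compile(
    r"[A-Z][a-z]{2}\s+\d+\s+\d\d:\d\d:\d\d\s+(?P<issuer>\S+)\s+"
    r"cluquorumd\[\d+\]:\s+<notice>\s+STONITH:\s+(?P<victim>\S+)\s+has been fenced!"
)


def tally_fencing(lines):
    """Count completed fencing events per (issuer, victim) pair."""
    counts = Counter()
    for line in lines:
        match = FENCED.search(line)
        if match:
            counts[(match.group("issuer"), match.group("victim"))] += 1
    return counts
```

Fed the six lines above, this counts the pair (node10, node3) twice. Passing an open log file works as well, since the function only iterates over lines.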
 
Thanks,
 Dan
 



