[Linux-cluster] Power based fencing in cluster causes single point of failure that can take down a cluster

Jonathan Biggar jon at levanta.com
Tue Jan 9 18:50:53 UTC 2007


If we set up a cluster and use network power switches for fencing, won't 
the failure of the power switch attached to a cluster member prevent all 
services that were running on that node from migrating to other cluster 
members?

This is what seems to happen to us in practice: fencing the offline 
member fails because the power switch is unavailable, so rgmanager 
never migrates the failed service(s) to another member.
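
To make the failure mode concrete, here is a rough sketch in Python of how 
recovery appears to be gated on fencing. It is only a simplified model of 
the behavior we observe, not actual fenced or rgmanager code, and every 
name in it (PowerSwitch, fence_node, recover_services) is made up for 
illustration.

```python
# Simplified model of the observed behavior; NOT actual fenced/rgmanager code.
# All names here are hypothetical and exist only for illustration.

class PowerSwitch:
    def __init__(self, reachable):
        self.reachable = reachable

    def power_off(self, node):
        # A dead or unreachable switch cannot confirm that the node is off.
        return self.reachable


def fence_node(node, switch):
    """Return True only if the switch confirms the node was powered off."""
    return switch.power_off(node)


def recover_services(failed_node, services, switch, survivors):
    """Relocate services only after the failed node has been fenced."""
    if not fence_node(failed_node, switch):
        # Without a confirmed fence the node might still be touching shared
        # storage, so nothing is relocated and the services stay down.
        return {}
    # Fence succeeded: spread the services across the surviving members.
    return {svc: survivors[i % len(survivors)] for i, svc in enumerate(services)}


# Switch reachable: services move to the survivors.
print(recover_services("node1", ["nfs", "httpd"], PowerSwitch(True), ["node2", "node3"]))
# Switch unavailable: fencing fails, so no service is relocated.
print(recover_services("node1", ["nfs", "httpd"], PowerSwitch(False), ["node2", "node3"]))
```

In this model the second call returns an empty mapping, which corresponds 
to the case where the power switch is down and the services never move.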

Is there a general solution to this problem that I'm missing?

-- 
Jon Biggar
Levanta
jon at levanta.com
650-403-7252



