[Linux-cluster] two fencing problems

Greg Forte gforte at leopard.us.udel.edu
Tue Dec 20 15:21:22 UTC 2005


Lon Hohberger wrote:
> On Wed, 2005-12-07 at 10:08 -0500, Greg Forte wrote:
> 
> 
>>    <device name="FENCE1" option="reboot" port="1"/>
>>    <device name="FENCE2" option="reboot" port="1"/>
>>
>>and increased the reboot wait time on the PDUs to make sure it'd wait 
>>long enough, and that SEEMS to work (once I remembered to turn off ccsd 
>>before updating my cluster.conf by hand so that it didn't end up 
>>replacing it with the old one immediately ;-)
> 
> 
> I don't know how I missed this, but this is a poor idea.
> 
> What if fenced hangs in the middle?  Then you haven't turned off the
> power at all, but the cluster thinks you did!  Goodbye, file systems!
> 
> There's no way to guarantee that both ports were turned off
> simultaneously, irrespective of the timeout values. :(
> 
> You could do:
> 
>    <device name="FENCE1" option="off" port="1"/>
>    <device name="FENCE2" option="reboot" port="1"/>
>    <device name="FENCE1" option="on" port="1"/>
> 
> ...but that's about as "optimal" as you can get while still being safe.

Sure sure ... except that any sequence of multiple commands to the same
fence device doesn't work (per the bug that David Teigland dug up
somewhere in this thread).  ;-)
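
For anyone following along, the difference between the two sequences above can be sketched with a toy model. This is only an illustration, not how fenced or the fence agents work internally: the function name node_lost_power is made up, and it assumes that a "reboot" on one PDU port completes (off, then back on) before the next command is issued, which is the worst case the longer reboot wait time was meant to avoid.

```python
# Toy model: a node with two power supplies stays up while EITHER PDU port is on.
def node_lost_power(sequence):
    """Return True if at some point both ports were off at the same time."""
    ports = {"FENCE1": True, "FENCE2": True}  # True = outlet energized
    lost = False
    for device, option in sequence:
        # Assumption: "reboot" is off then on, finishing before the next command.
        steps = {"off": [False], "on": [True], "reboot": [False, True]}[option]
        for state in steps:
            ports[device] = state
            if not any(ports.values()):
                lost = True  # both supplies dark: the node really was fenced
    return lost

# The two sequences discussed in this thread.
both_reboot = [("FENCE1", "reboot"), ("FENCE2", "reboot")]
off_reboot_on = [("FENCE1", "off"), ("FENCE2", "reboot"), ("FENCE1", "on")]

print(node_lost_power(both_reboot))    # False
print(node_lost_power(off_reboot_on))  # True
```

Under that assumption, the reboot/reboot sequence never leaves both ports off at once, while off/reboot/on always does, no matter how the timing works out.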

-g



