[Linux-cluster] Node with failed service does not get fenced.

Lon Hohberger lhh at redhat.com
Tue Jul 22 20:10:18 UTC 2008


On Mon, 2008-07-21 at 23:35 +0200, Jonas Helgi Palsson wrote:
> Hi
> 
> Running CentOS 5.2, all current updates on x86_64 platform.
> 
> I have set up a 2-node cluster with the following resources in one service:
> 
> * one shared MD device (the resource is a script that assembles and stops
> the device and checks its status),
> * one shared filesystem,
> * one shared NFS startup script,
> * one shared ip.
> 
> These are started in that order.
> 
> The cluster works normally; I can move the service between the two nodes.
> 
> But I have observed one behavior that is not good. Once, when trying to move
> the service from one node to another, the cluster manager could not "umount"
> the filesystem.
> Although "lsof | grep <mountpoint>" did not show anything, "umount -f
> <mountpoint>" did not work either. ("umount -l <mountpoint>" did the job.)

> Is there anything magical one can put in cluster.conf to get the behavior I
> want? That is, if a service does not stop cleanly, fence the node and start
> the service on another node?

Add self_fence="1" to the <fs> resource.
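
For reference, a minimal sketch of where that attribute goes (the resource
name, device, mount point, and fstype below are placeholders, not taken from
your config):

    <fs name="shared_fs" device="/dev/md0" mountpoint="/mnt/shared"
        fstype="ext3" force_unmount="1" self_fence="1"/>

With self_fence="1", rgmanager reboots the node itself if the unmount fails
during a stop, so the service can relocate to the other node. The optional
force_unmount="1" asks rgmanager to kill any processes holding the mount
point before trying to unmount.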

-- Lon



