[Linux-cluster] problem with rejoining a node

Patrick Caulfield pcaulfie at redhat.com
Tue Aug 9 12:25:08 UTC 2005


Javi Polo wrote:
> On Aug/08/2005, Patrick Caulfield wrote:
> 
> 
>>What sort of fencing are you using? If it's a power-switch fence then the
>>node should be hard rebooted. If it's SAN fencing then you'll have to get the
> 
> 
> It's SAN fencing. I modified the fence_sanbox2.pl script to suit the
> switch commands, and "by hand" it works perfectly.
> 
> 
>>node out of the cluster - the remaining two nodes /should/ tell it to leave the
>>cluster.
> 
> 
> So, when the node recovers and "says hello" to the cluster, the other
> two do take it out of the cluster?

Yes. Is that not happening?

> 
>>A node can't just "rejoin" a cluster after being SAN fenced. It must be removed
>>from the cluster and rejoin from scratch. There's far too much state involved
>>for it to merge seamlessly back into a cluster.
> 
> 
> Must I do it manually, or is there some kind of automated process here?
> 
> What steps should the node perform after being fenced so that it can
> join the cluster again?
> (Sorry for asking so much, but I'm really lost here :/ )

A reboot is usually the easiest way to ensure that a node is "clean". If you
can umount all the GFS filesystems and stop all the cluster subsystems (fence,
clvmd, gfs), then you should be able to run the startup scripts again, but it's
just a faff.
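
For illustration, the manual route might look roughly like the sketch below.
It is only a dry run: by default it prints each command instead of executing
it. The ordering, the use of killall to stop clvmd, and the helper names
run and clean_leave are assumptions made for this example, not a tested
procedure; your distribution's init scripts may do this differently.

```shell
#!/bin/sh
# Dry-run sketch of leaving the cluster cleanly without a reboot.
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to
# really execute them, and only after reviewing them for your setup.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper: tear down the subsystems from the top of the
# stack down (filesystems, then clvmd, then fence domain, then cman).
clean_leave() {
    run umount -a -t gfs    # unmount all mounted GFS filesystems
    run killall clvmd       # assumption: no init script for clvmd is used
    run fence_tool leave    # leave the fence domain
    run cman_tool leave     # should no longer complain about active subsystems
}

clean_leave
```

If every step succeeds, the usual startup scripts can then be run again to
rejoin. If any step hangs or fails, a reboot is still the simpler option.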

>>>gfstest1:~# cman_tool join
>>>cman_tool: Node is already active
>>>gfstest1:~# cman_tool leave
>>>cman_tool: Can't leave cluster while there are 5 active subsystems
>>
>>cman_tool leave force will force it to leave, but you might find it still needs
>>a reboot to clear the filesystems.
> 
> 
> So, if we simply lose connectivity between nodes, we should still
> reboot the server so it can be "clean" and join the cluster again?
> 
> And if so, should I manually enable the port on the SAN, or will fenced
> do it for me (as the script does actually accept an enable parameter)?
> 

I don't know. I've never had access to a SAN fencing device!

-- 

patrick



