[Linux-cluster] problem with rejoining a node

Javi Polo javipolo at datagrama.net
Mon Aug 8 21:55:30 UTC 2005


On Aug/08/2005, Patrick Caulfield wrote:

> What sort of fencing are you using? If it's a power-switch fence then the
> node should be hard rebooted. If it's SAN fencing then you'll have to get the

It's SAN fencing. I modified the fence_sanbox2.pl script to suit the
switch's commands, and when run by hand it works perfectly.

> node out of the cluster - the remaining two nodes /should/ tell it to leave the
> cluster.

So, when the node recovers and "says hello" to the cluster, do the other
two take it out of the cluster?

> A node can't just "rejoin" a cluster after being SAN fenced. It must be removed
> from the cluster and rejoin from scratch. There's far too much state involved
> for it to merge seamlessly back into a cluster.

Must I do it manually, or is there some kind of automated process for this?

What steps should the node perform after being fenced so that it can
join the cluster again?
(Sorry for asking so much, but I'm really lost here :/ )

> > gfstest1:~# cman_tool join
> > cman_tool: Node is already active
> > gfstest1:~# cman_tool leave
> > cman_tool: Can't leave cluster while there are 5 active subsystems
> "cman_tool leave force" will force it to leave, but you might find it still needs
> a reboot to clear the filesystems.

So, if we simply lose connectivity between nodes, we should still
reboot the server so that it is "clean" and can join the cluster again?

And if so, should I manually enable the port on the SAN, or will fenced
do it for me (the script does actually accept an enable parameter)?
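
To check that I've understood, this is roughly the manual sequence I have
in mind. It is only a sketch and untested: the run wrapper, the DRY_RUN
variable and the function name are mine, just for illustration, and by
default it only prints the commands instead of executing them. It does not
cover re-enabling the port on the SAN or the reboot you mention, since I
don't know yet whether those are needed.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical name for the sequence discussed in this thread.
rejoin_after_fence() {
    # Force the node out even though subsystems are still active.
    run cman_tool leave force
    # Then join the cluster again from scratch.
    run cman_tool join
}

rejoin_after_fence
```

With the default DRY_RUN=1 it just lists the two cman_tool calls, so it is
safe to run on a node to see what it would do.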

-- 
Javier Polo @ Datagrama
902 136 126



