[Linux-cluster] ccsd problems

Lon Hohberger lhh at redhat.com
Mon Jan 15 14:44:41 UTC 2007


On Fri, 2007-01-12 at 15:38 -0500, Andre Henry wrote:

> In any event it's working now. If something is stuck in recover, is
> there any way to flush it out w/o rebooting?

It depends on what is stuck.  rgmanager used to cause things to get stuck
in the recover state, but that's been fixed in CVS and in other
packages I've posted to linux-cluster.

fenced will stay stuck in 'recover' until fencing completes.  If fencing
is broken and keeps failing, you can unstick it by figuring out which node
is fencing the other(s), logging in to that node, and doing something like
the following (warning, untested):

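# Swap the real agent for a stub that always exits 0
mv /sbin/my_fence_agent /sbin/my_fence_agent.bak
ln -sf /bin/true /sbin/my_fence_agent
# Leave the stub in place long enough for fenced to call it
sleep 30
# Put the real agent back
rm /sbin/my_fence_agent
mv /sbin/my_fence_agent.bak /sbin/my_fence_agent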

I have a patch which will be going into HEAD soon that gives you a
manual override if fencing fails, but until it's there, the above is
basically a way to trick fenced into thinking fencing has completed.
Obviously, don't do this until you've manually shut down the node or
verified that it is down (a failed 'ping' alone is not adequate proof).
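
If you'd rather not type the swap by hand, it can be wrapped in a small
shell function with a couple of sanity checks, roughly as below.  This is
only a sketch and just as untested against a live cluster as the commands
above.  The name stub_fence_agent and its two arguments are made up for
illustration, and /sbin/my_fence_agent is still a placeholder for whichever
agent your cluster actually uses.

```shell
#!/bin/sh
# Sketch only: temporarily replace a fence agent with a stub that always
# succeeds, then restore the original. The function name and arguments
# are hypothetical, not part of any cluster package.

stub_fence_agent() {
    agent=$1            # path to the real fence agent
    hold=${2:-30}       # seconds to leave the stub in place

    # Refuse to run if the agent is missing or a stale backup exists.
    if [ ! -f "$agent" ]; then
        echo "no such agent: $agent" >&2
        return 1
    fi
    if [ -e "$agent.bak" ]; then
        echo "$agent.bak already exists, not overwriting" >&2
        return 1
    fi

    mv "$agent" "$agent.bak" || return 1

    # Any call to the agent now exits 0; undo the move if the link fails.
    if ! ln -s /bin/true "$agent"; then
        mv "$agent.bak" "$agent"
        return 1
    fi

    sleep "$hold"

    # Restore the real agent.
    rm -f "$agent"
    mv "$agent.bak" "$agent"
}
```

Run as root on the node that is doing the fencing, it would be called
like this:

stub_fence_agent /sbin/my_fence_agent 30

The same warning applies: only do this once you know the other node is
really down.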

-- Lon



