[Linux-cluster] how to disable one node

Ralph.Grothe at itdz-berlin.de
Mon Jul 11 13:11:44 UTC 2011


In the RHN knowledge base there is an article entitled
"How do I disable the cluster software on a member system in Red
Hat Enterprise Linux?".  I'm not sure whether you can access it
(I think it requires a login account at RHN), or whether it
addresses your issue:

 
https://access.redhat.com/kb/docs/DOC-5695
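
From memory, the gist of it is to stop the cluster manager on the
member you want out and keep it from starting again at boot.  The
following is a rough sketch from memory rather than a quote from
the article, so please double-check against the document itself:

    # on the node to be taken out of the cluster
    service clumanager stop     # stop the cluster manager cleanly
    chkconfig clumanager off    # keep it from starting at boot
    # or remove it from chkconfig altogether:
    # chkconfig --del clumanager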


Good Luck
Ralph
________________________________

	From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Helen
Heath
	Sent: Wednesday, July 06, 2011 2:14 PM
	To: linux-cluster at redhat.com
	Subject: [Linux-cluster] how to disable one node
	
	
	Hi all -
	 
	I hope someone can shed some light on this.  I have a
2-node cluster running on RedHat 3 which has a shared /clust1
filesystem and is connected to a network power switch.  Something
is very wrong with the cluster: every day it currently reboots
whichever node is the primary, for no reason I can track down.
There are no hardware faults anywhere in the cluster, no failures
of any kind being logged in any log files, and so on.  It started
well over a year ago with the primary node rebooting every other
week, then over time it progressed to once a week, then once a
day.  I logged a call with RedHat way back when it first started;
nothing was ever found to be the problem, and of course in time
RedHat v3 went out of support and they would no longer assist in
troubleshooting the issue.  Prior to this problem starting, the
cluster had been running happily with no issues for about 5
years.
	 
	Now this cluster is shortly being replaced with new
hardware and RedHat 5, so hopefully whatever the problem is will
vanish as mysteriously as it appeared.  In the meantime, however,
I need to stop this daily reboot, as it is playing havoc with the
application that runs on this system (a heavily-utilised
database).  Having tried everything I can think of, I decided to
'break' the cluster; i.e., take down one node so that only one
node remains, running the application.
	 
	I cannot find a way to do this that persists across a
reboot of the node that should be out of the cluster.  I've run
"/sbin/chkconfig --del clumanager" and it did take the service
out of chkconfig (I verified this).  The RedHat document
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/3/html
/Cluster_Administration/s1-admin-disable.html seems to indicate
this should persist across a reboot, i.e. you reboot the node and
it does not attempt to rejoin the cluster; however, this didn't
work!  The cluster monitoring software on the primary node saw
that the secondary node was down, STONITH kicked in, the NPS
power-cycled the port this node is connected to, and the
secondary node rebooted and rejoined the cluster!
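	 
	For reference, this is roughly what I ran on the
secondary node (paraphrased; the verification shown is just one
way of checking):
	 
	    # take clumanager out of the boot sequence
	    /sbin/chkconfig --del clumanager
	    # confirm no start/kill links for it remain in the runlevels
	    ls /etc/rc.d/rc?.d/ | grep -i clumanager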
	 
	Does anyone know either how to temporarily remove the
secondary node from the cluster in a way that persists across
reboots but can easily be brought back into the cluster when
needed, or else (and preferably) how to temporarily stop the
cluster monitoring software running on the primary node from even
looking out for the secondary node - as in, it doesn't care
whether the secondary node is up or not?  I've checked that, while
the secondary node is down, the primary node is quite happy to
carry on processing as usual; but as soon as the cluster
monitoring software on the primary node realises the secondary
node is down, it reboots it, and I'm back to square one!
	 
	This is now really urgent (I've been trying to find an
answer to this for some weeks now) as I go on holiday on Friday
and I really don't want to leave my second-in-command with a mess
on his hands!
	 
	thanks
	 
	-- 
	Helen Heath
	helen_heath at fastmail.fm
	=*=
	Everything that has a beginning has an ending. Make your
peace with that and all will be well.
	-- Buddhist saying




