[Linux-cluster] rgmanager gets stuck on shutdown, if no services are running on its node.
Jankowski, Chris
Chris.Jankowski at hp.com
Thu Dec 9 06:58:41 UTC 2010
Lon,
I think that I got to the bottom of the problem:
If there are *no* services running on a node and you issue "shutdown -h now" on it, then when the shutdown reaches rgmanager, rgmanager executes the following sequence:
1. Outputs the "Shutting down" message to /var/log/messages
2. Waits for status_poll_interval seconds
3. Outputs the message "Shutdown complete, exiting" and completes its own shutdown.
In my case, I had <rm status_poll_interval="3600"/>, because my service scripts do not have a viable status check and the status check messages were clogging up /var/log/messages. So rgmanager appeared to be stuck, whereas it was really just waiting.
I think this is a logic bug. rgmanager should not wait in this situation.
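One way to confirm that the delay really matches the polling interval is to measure the gap between the two messages in the log. The following is only a sketch: the function name rgmanager_shutdown_gap is made up for illustration, and it assumes traditional syslog lines ("Mon DD HH:MM:SS host tag: message") in /var/log/messages, so that the time of day is the third field.

```shell
#!/bin/sh
# Print the number of seconds between rgmanager's "Shutting down" and
# "Shutdown complete, exiting" messages in a syslog file.
# Assumption: traditional syslog format, time of day (HH:MM:SS) in field 3.
# Only the most recent shutdown sequence in the file is reported.
rgmanager_shutdown_gap() {
    log=${1:-/var/log/messages}
    awk '
        # Convert HH:MM:SS to seconds since midnight.
        function secs(t,   p) { split(t, p, ":"); return p[1]*3600 + p[2]*60 + p[3] }
        /rgmanager/ && /Shutting down/              { start = secs($3); have_start = 1; have_stop = 0 }
        /rgmanager/ && /Shutdown complete, exiting/ { stop = secs($3); have_stop = 1 }
        END {
            # Fail if either message is missing (e.g. rgmanager is still waiting).
            if (!have_start || !have_stop) exit 1
            gap = stop - start
            if (gap < 0) gap += 86400   # the shutdown spanned midnight
            print gap
        }
    ' "$log"
}
```

Calling rgmanager_shutdown_gap with no argument after the node comes back up prints the gap in seconds; a value close to status_poll_interval would be consistent with the behaviour described above.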
------------
By comparison, if there is a service running on a node and you issue "shutdown -h now" on it, then when the shutdown reaches rgmanager, rgmanager executes the following sequence:
1. Outputs the "Shutting down" message to /var/log/messages
2. Proceeds *immediately* (no wait) to shutting down the service
3. When the service is shut down, rgmanager *immediately* outputs "Shutdown complete, exiting" and completes its own shutdown.
-------------
As a workaround, I set status_poll_interval="10" for the time being, although I believe that I should not be forced to rely on a short polling interval.
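Before changing the value, it can help to confirm what a node is actually configured with. The following is a minimal sketch, assuming the configuration lives in /etc/cluster/cluster.conf and that the attribute sits on the same line as the opening <rm tag; the function name get_status_poll_interval is made up for illustration.

```shell
#!/bin/sh
# Print the status_poll_interval value set on the <rm> element of a
# cluster.conf file. Prints nothing if the attribute is not set.
# Assumption: the attribute is on the same line as the opening <rm tag.
get_status_poll_interval() {
    conf=${1:-/etc/cluster/cluster.conf}
    sed -n 's/.*<rm[[:space:]][^>]*status_poll_interval="\([0-9][0-9]*\)".*/\1/p' "$conf" | head -n 1
}
```

Running get_status_poll_interval with no argument on a node prints the configured number of seconds, which under the behaviour described above is also roughly how long an idle rgmanager takes to exit.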
Regards,
Chris Jankowski
-----Original Message-----
From: Jankowski, Chris
Sent: Thursday, 9 December 2010 16:08
To: linux clustering
Subject: RE: [Linux-cluster] rgmanager gets stuck on shutdown, if no services are running on its node.
Lon,
The problem is reproducible at will. I do have access to the system after the "shutdown -h now" command is issued and rgmanager blocks.
I have gdb installed, but I do not know how to obtain rgmanager-debuginfo. The system is on an isolated network and I pointed yum to an on-disk repository that is a copy of the RHEL6 distribution DVD.
Thanks and regards,
Chris
-----Original Message-----
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Lon Hohberger
Sent: Thursday, 9 December 2010 06:46
To: linux clustering
Subject: Re: [Linux-cluster] rgmanager gets stuck on shutdown, if no services are running on its node.
On Wed, 2010-12-08 at 03:11 +0000, Jankowski, Chris wrote:
> Hi,
>
> I configured a cluster of 2 RHEL6 nodes.
> The cluster has only one HA service defined.
>
> I have a problem with rgmanager getting stuck on shutdown when a certain
> set of conditions is met. The details follow.
>
> 1.
> If I execute "shutdown -h now" on the node that is *not* running the
> HA service then the shutdown process gets stuck with the last message
> in the /var/log/messages being:
>
Is this reproducible outside of 'shutdown -h now', ex: does 'service
rgmanager stop' work in your configuration?
If you can still reach the machine (ssh or whatever) after executing
'shutdown -h now':
1) Install 'rgmanager-debuginfo' and gdb.
2) When rgmanager hangs on shutdown, run:
- gdb /usr/sbin/rgmanager `pidof -s rgmanager`
3) When inside gdb, run:
- thr a a bt
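The same backtrace can also be captured without an interactive session, which is handy while the node is in the middle of shutting down. This is a sketch only: the function name dump_backtraces and the default output path are hypothetical, and it assumes gdb and pidof are on the PATH.

```shell
#!/bin/sh
# Write backtraces of all threads of a running daemon to a file,
# without entering an interactive gdb session.
# Arguments: process name (default rgmanager), output file (hypothetical default).
dump_backtraces() {
    name=${1:-rgmanager}
    out=${2:-/tmp/$name-backtrace.txt}
    pid=$(pidof -s "$name" 2>/dev/null) || pid=""
    if [ -z "$pid" ]; then
        echo "$name is not running" >&2
        return 1
    fi
    # -batch makes gdb exit after running the given command;
    # "thread apply all bt" is the long form of "thr a a bt".
    gdb -p "$pid" -batch -ex "thread apply all bt" > "$out" 2>&1
}
```

Calling dump_backtraces with no arguments while rgmanager is blocked leaves the output in the file, which can then be attached to a reply. Without rgmanager-debuginfo installed the backtrace will lack symbol detail.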
There's a bug in RHEL5 related to releasing the lockspace if
CMAN exits before rgmanager, but I was unable to reproduce it on the
STABLE3/31 branches when I tested.
-- Lon