[Linux-cluster] rgmanager or clustat problem

Lon Hohberger lhh at redhat.com
Mon Apr 9 20:16:16 UTC 2007


On Mon, Apr 09, 2007 at 12:22:26PM -0500, David M wrote:
> I am running a four node GFS cluster with about 20 services per node.  All
> four nodes belong to the same failover domain, and they each have a priority
> of 1.  My shared storage is an iSCSI SAN.
> 
> After rgmanager has been running for a couple of days, clustat produces the
> following result on all four nodes:
> 
> Timed out waiting for a response from Resource Group Manager
> Member Status: Quorate
> 
>  Member Name                              Status
>  ------ ----                              ------
>  node01                                   Online, rgmanager
>  node02                                   Online, Local, rgmanager
>  node03                                   Online, rgmanager
>  node04                                   Online, rgmanager
> 
> I also get a timeout when I try to determine the status of a particular
> service with "clustat -s servicename".
> 
> All of the services seem to be up and running, but clustat does not work.
> Is there something wrong?  Is there a way for me to increase the timeout?
> 
> clurgmgrd and dlm_recvd seem to be using a lot of CPU cycles on node02, 40
> and 60 percent, respectively.

What version of rgmanager do you have installed?  It sounds like you're
hitting #212644, which is fixed with packages available here:

http://people.redhat.com/lhh/packages.html

(It will also be fixed in the next Red Hat update, which will then
trickle down to CentOS, I suspect.)

-- Lon

-- 
Lon Hohberger - Software Engineer - Red Hat, Inc.



