[Linux-cluster] rgmanager or clustat problem
David M
diggercheer at gmail.com
Mon Apr 9 17:22:26 UTC 2007
I am running a four-node GFS cluster with about 20 services per node. All
four nodes belong to the same failover domain, and each has a priority
of 1. My shared storage is an iSCSI SAN.
After rgmanager has been running for a couple of days, clustat produces the
following result on all four nodes:
Timed out waiting for a response from Resource Group Manager
Member Status: Quorate
  Member Name        Status
  ------ ----        ------
  node01             Online, rgmanager
  node02             Online, Local, rgmanager
  node03             Online, rgmanager
  node04             Online, rgmanager
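To watch for this condition from a script, the output above can be checked mechanically. The following is only a rough sketch in Python: it assumes the clustat output looks exactly like the sample pasted above (a timeout message, then a member table after a dashed separator, with no service section), and the names `parse_clustat` and `SAMPLE` are made up for illustration.

```python
TIMEOUT_MSG = "Timed out waiting for a response from Resource Group Manager"

# Sample text copied from the clustat output above.
SAMPLE = """\
Timed out waiting for a response from Resource Group Manager
Member Status: Quorate
Member Name Status
------ ---- ------
node01 Online, rgmanager
node02 Online, Local, rgmanager
node03 Online, rgmanager
node04 Online, rgmanager
"""


def parse_clustat(text):
    """Return (timed_out, members) for clustat output shaped like SAMPLE.

    members maps each member name to its list of status flags.
    Assumption: every line after the dashed separator is a member row.
    """
    timed_out = False
    members = {}
    in_table = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(TIMEOUT_MSG):
            timed_out = True
        elif line.startswith("------"):
            in_table = True
        elif in_table:
            parts = line.split(None, 1)  # member name, then the status text
            if len(parts) == 2:
                name, status = parts
                members[name] = [flag.strip() for flag in status.split(",")]
    return timed_out, members


timed_out, members = parse_clustat(SAMPLE)
print("rgmanager timed out:", timed_out)
for name, flags in sorted(members.items()):
    print(name, flags)
```

The text could come from running clustat through `subprocess.run(["clustat"], capture_output=True, text=True)` instead of the pasted sample; that step is left out here so the sketch runs without a cluster.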
I also get a timeout when I try to check the status of a particular
service with "clustat -s servicename".
All of the services seem to be up and running, but clustat does not work.
Is there something wrong? Is there a way for me to increase the timeout?
clurgmgrd and dlm_recvd seem to be using a lot of CPU on node02: 40
and 60 percent, respectively.
Thank you for your help.
cman_tool services:
NODE01:
Service          Name              GID LID State Code
Fence Domain:    "default"           4   2 run   -
[1 3 2 4]
DLM Lock Space:  "clvmd"             1   3 run   -
[1 3 2 4]
DLM Lock Space:  "Magma"             3   5 run   -
[1 3 2 4]
DLM Lock Space:  "gfslv"             5   6 run   -
[2 1 3 4]
GFS Mount Group: "gfslv"             6   7 run   -
[2 1 3 4]
User:            "usrm::manager"     2   4 run   -
[1 3 2 4]

NODE02:
Service          Name              GID LID State Code
Fence Domain:    "default"           4   5 run   -
[1 3 2 4]
DLM Lock Space:  "clvmd"             1   1 run   -
[1 3 2 4]
DLM Lock Space:  "Magma"             3   3 run   -
[1 3 2 4]
DLM Lock Space:  "gfslv"             5   6 run   -
[1 4 2 3]
GFS Mount Group: "gfslv"             6   7 run   -
[1 4 2 3]
User:            "usrm::manager"     2   2 run   -
[1 3 2 4]

NODE03:
Service          Name              GID LID State Code
Fence Domain:    "default"           4   2 run   -
[1 2 3 4]
DLM Lock Space:  "clvmd"             1   3 run   -
[1 2 3 4]
DLM Lock Space:  "Magma"             3   5 run   -
[1 2 3 4]
DLM Lock Space:  "gfslv"             5   6 run   -
[1 2 4 3]
GFS Mount Group: "gfslv"             6   7 run   -
[1 2 4 3]
User:            "usrm::manager"     2   4 run   -
[1 2 3 4]

NODE04:
Service          Name              GID LID State Code
Fence Domain:    "default"           4   2 run   -
[1 2 3 4]
DLM Lock Space:  "clvmd"             1   3 run   -
[1 2 3 4]
DLM Lock Space:  "Magma"             3   5 run   -
[1 2 3 4]
DLM Lock Space:  "gfslv"             5   6 run   -
[1 4 2 3]
GFS Mount Group: "gfslv"             6   7 run   -
[1 4 2 3]
User:            "usrm::manager"     2   4 run   -
[1 2 3 4]
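
The member lists above are easier to compare across nodes with a small script. This is a minimal sketch in Python, assuming the output format shown above (a service line with a quoted name, followed by a bracketed list of node IDs). The helper `parse_services` and the two sample strings are hypothetical names, and the samples hold only two groups each, copied from the NODE01 and NODE02 output. The sketch ignores the order in which node IDs are listed and compares only the set of members; it does not try to decide whether that order is significant.

```python
import re

# Two groups copied from the NODE01 and NODE02 output above.
NODE01_SAMPLE = """\
Service Name GID LID State Code
DLM Lock Space: "gfslv" 5 6 run -
[2 1 3 4]
User: "usrm::manager" 2 4 run -
[1 3 2 4]
"""

NODE02_SAMPLE = """\
Service Name GID LID State Code
DLM Lock Space: "gfslv" 5 6 run -
[1 4 2 3]
User: "usrm::manager" 2 2 run -
[1 3 2 4]
"""

# A service line looks like:  <type>: "<name>" <gid> <lid> <state> <code>
SERVICE_RE = re.compile(r'^(.*?):\s*"([^"]+)"')


def parse_services(text):
    """Map (service type, name) to the sorted list of member node IDs."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = SERVICE_RE.match(line)
        if match:
            current = (match.group(1), match.group(2))
        elif current and line.startswith("[") and line.endswith("]"):
            # Sorting drops the listing order on purpose.
            groups[current] = sorted(int(n) for n in line[1:-1].split())
            current = None
    return groups


node01 = parse_services(NODE01_SAMPLE)
node02 = parse_services(NODE02_SAMPLE)
for key in sorted(node01):
    same = node01[key] == node02.get(key)
    print(key, node01[key], "same members on both nodes:", same)
```

For the two groups sampled here, both nodes list the same four node IDs, only in a different order.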