<div> </div>
<div>I am running a four-node GFS cluster with about 20 services per node. All four nodes belong to the same failover domain, and each has a priority of 1. My shared storage is an iSCSI SAN.</div>
<div> </div>
<div>After rgmanager has been running for a couple of days, clustat produces the following result on all four nodes:</div>
<div>
<p>Timed out waiting for a response from Resource Group Manager<br>Member Status: Quorate</p>
<p> Member Name Status<br> ------ ---- ------<br> node01 Online, rgmanager<br> node02 Online, Local, rgmanager<br> node03 Online, rgmanager
<br> node04 Online, rgmanager</p>
<p>I also get a timeout when I try to check the status of a particular service with "clustat -s servicename".</p>
<p>All of the services seem to be up and running, but clustat does not work. Is something wrong? Is there a way for me to increase the timeout?</p>
<p>On node02, clurgmgrd and dlm_recvd seem to be using a lot of CPU: about 40 and 60 percent, respectively.</p>
<p>Thank you for your help.</p>
<p>Output of "cman_tool services" on each node:</p>
<p>NODE01:</p>
<p>Service Name GID LID State Code<br>Fence Domain: "default" 4 2 run -<br>[1 3 2 4]</p>
<p>DLM Lock Space: "clvmd" 1 3 run -<br>[1 3 2 4]</p>
<p>DLM Lock Space: "Magma" 3 5 run -<br>[1 3 2 4]</p>
<p>DLM Lock Space: "gfslv" 5 6 run -<br>[2 1 3 4]</p>
<p>GFS Mount Group: "gfslv" 6 7 run -<br>[2 1 3 4]</p>
<p>User: "usrm::manager" 2 4 run -<br>[1 3 2 4]<br></p>
<p>NODE02:<br>Service Name GID LID State Code<br>Fence Domain: "default" 4 5 run -<br>[1 3 2 4]</p>
<p>DLM Lock Space: "clvmd" 1 1 run -<br>[1 3 2 4]</p>
<p>DLM Lock Space: "Magma" 3 3 run -<br>[1 3 2 4]</p>
<p>DLM Lock Space: "gfslv" 5 6 run -<br>[1 4 2 3]</p>
<p>GFS Mount Group: "gfslv" 6 7 run -<br>[1 4 2 3]</p>
<p>User: "usrm::manager" 2 2 run -<br>[1 3 2 4]<br></p></div>
<div>
<p>NODE03:<br>Service Name GID LID State Code<br>Fence Domain: "default" 4 2 run -<br>[1 2 3 4]</p>
<p>DLM Lock Space: "clvmd" 1 3 run -<br>[1 2 3 4]</p>
<p>DLM Lock Space: "Magma" 3 5 run -<br>[1 2 3 4]</p>
<p>DLM Lock Space: "gfslv" 5 6 run -<br>[1 2 4 3]</p>
<p>GFS Mount Group: "gfslv" 6 7 run -<br>[1 2 4 3]</p>
<p>User: "usrm::manager" 2 4 run -<br>[1 2 3 4]</p>
<p>NODE04:<br>Service Name GID LID State Code<br>Fence Domain: "default" 4 2 run -<br>[1 2 3 4]</p>
<p>DLM Lock Space: "clvmd" 1 3 run -<br>[1 2 3 4]</p>
<p>DLM Lock Space: "Magma" 3 5 run -<br>[1 2 3 4]</p>
<p>DLM Lock Space: "gfslv" 5 6 run -<br>[1 4 2 3]</p>
<p>GFS Mount Group: "gfslv" 6 7 run -<br>[1 4 2 3]</p>
<p>User: "usrm::manager" 2 4 run -<br>[1 2 3 4]<br></p></div>