[Linux-cluster] cluster is not quorate. refusing connection

Matthew B. Brookover mbrookov at mines.edu
Fri Dec 10 22:16:01 UTC 2004


On Fri, 2004-12-10 at 00:17, David Teigland wrote:
> This sounds similar to a problem I have if I run fence_tool without ccsd
> running.
> 
> Check /proc/cluster/status while it's waiting to see if the cluster
> actually has quorum or not.  Also, I've added some extra checking and
> debugging to fence_tool that should help narrow down where things are
> stuck.  Please update from cvs and rebuild at least the stuff in
> cluster/fence; then use "fence_tool join -D".
> 
> Usually things get stuck talking to ccs when ccs/magma libraries are out
> of sync, but this case sounds different.

Ok, I pulled the updates from CVS and rebuilt the code and the kernel.

On node fiveoften, fence_tool printed out some errors and exited with
status 1.  The errors are below.

On node fouroften, fence_tool did not print any messages and it did
not exit.  I am guessing that fence_tool did not exit because of a
feature of the -D flag.  fence_tool did start up fenced on fouroften.

Ccsd started up and is running on both nodes.

According to /proc/cluster/status and /proc/cluster/nodes on both
fouroften and fiveoften, the cluster is up and has quorum.

fence_tool printed these messages on node fiveoften:
+ fence_tool join -D
fence_tool: cannot connect to ccs -111

fence_tool: wait for quorum
fence_tool: waiting for cluster quorum
fence_tool: waiting for cluster quorum
fence_tool: waiting for cluster quorum
fence_tool: get our node name
fence_tool: connect to ccs
fence_tool: waiting for ccs connection -111
fence_tool: waiting for ccs connection -111
fence_tool: waiting for ccs connection -111
fence_tool: waiting for ccs connection -111
fence_tool: waiting for ccs connection -111
fence_tool: waiting for ccs connection -111
fence_tool: waiting for ccs connection -111
fence_tool: waiting for ccs connection -111
fence_tool: waiting for ccs connection -111
fence_tool: waiting for ccs connection -111
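
The -111 in the output above looks like a negated errno value.  That is
an assumption, but on Linux/x86 errno 111 is ECONNREFUSED, which would
match the "Connection refused" lines that ccsd logs.  A small Python
sketch for decoding such a value (describe_ccs_error is a hypothetical
helper, not part of fence_tool):

```python
import errno
import os


def describe_ccs_error(code):
    """Return (symbolic name, message) for a negative errno-style code.

    Assumption: the number printed by fence_tool is a negated errno.
    """
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


if __name__ == "__main__":
    name, message = describe_ccs_error(-111)
    print(name, "-", message)
```

The mapping from number to name is platform specific, so this should be
run on the same kind of machine that produced the error.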

Log entries on node fiveoften:
Dec 10 14:08:39 fiveoften kernel: Lock_Harness <CVS> (built Dec 10 2004
09:14:45) installed
Dec 10 14:08:39 fiveoften kernel: GFS <CVS> (built Dec 10 2004 09:14:04)
installed
Dec 10 14:08:39 fiveoften kernel: CMAN <CVS> (built Dec 10 2004
09:51:59) installed
Dec 10 14:08:39 fiveoften kernel: NET: Registered protocol family 30
Dec 10 14:08:39 fiveoften kernel: DLM <CVS> (built Dec 10 2004 09:52:25)
installed
Dec 10 14:08:39 fiveoften kernel: Lock_DLM (built Dec 10 2004 09:14:25)
installed
Dec 10 14:08:40 fiveoften kernel: CMAN: Waiting to join or form a
Linux-cluster
Dec 10 14:09:11 fiveoften kernel: CMAN: sending membership request
Dec 10 14:09:11 fiveoften kernel: CMAN: got node fouroften
Dec 10 14:09:11 fiveoften kernel: CMAN: quorum regained, resuming
activity
Dec 10 14:09:11 fiveoften ccsd[3391]: Cluster is not quorate.  Refusing
connection.
Dec 10 14:09:11 fiveoften ccsd[3391]: Error while processing connect:
Connection refused
Dec 10 14:09:12 fiveoften ccsd[3391]: Cluster is not quorate.  Refusing
connection.
Dec 10 14:09:12 fiveoften ccsd[3391]: Error while processing connect:
Connection refused
Dec 10 14:09:13 fiveoften ccsd[3391]: Cluster is not quorate.  Refusing
connection.
Dec 10 14:09:13 fiveoften ccsd[3391]: Error while processing connect:
Connection refused
Dec 10 14:09:14 fiveoften ccsd[3391]: Cluster is not quorate.  Refusing
connection.
Dec 10 14:09:14 fiveoften ccsd[3391]: Error while processing connect:
Connection refused
Dec 10 14:09:15 fiveoften ccsd[3391]: Cluster is not quorate.  Refusing
connection.
Dec 10 14:09:15 fiveoften ccsd[3391]: Error while processing connect:
Connection refused
Dec 10 14:09:16 fiveoften ccsd[3391]: Cluster is not quorate.  Refusing
connection.
Dec 10 14:09:16 fiveoften ccsd[3391]: Error while processing connect:
Connection refused
Dec 10 14:09:17 fiveoften ccsd[3391]: Cluster is not quorate.  Refusing
connection.
Dec 10 14:09:17 fiveoften ccsd[3391]: Error while processing connect:
Connection refused
Dec 10 14:09:18 fiveoften ccsd[3391]: Cluster is not quorate.  Refusing
connection.
Dec 10 14:09:18 fiveoften ccsd[3391]: Error while processing connect:
Connection refused
Dec 10 14:09:19 fiveoften ccsd[3391]: Cluster is not quorate.  Refusing
connection.
Dec 10 14:09:19 fiveoften ccsd[3391]: Error while processing connect:
Connection refused
Dec 10 14:09:20 fiveoften ccsd[3391]: Cluster is not quorate.  Refusing
connection.
Dec 10 14:09:20 fiveoften ccsd[3391]: Error while processing connect:
Connection refused

The logs stopped when fence_tool exited.
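
The odd part of the log above is that ccsd keeps refusing connections
after CMAN has already reported that quorum was regained.  A rough
Python sketch for counting such refusals in a syslog excerpt follows.
The sample is three lines copied from the fiveoften log with the
wrapped lines rejoined, and refusals_after_quorum is a hypothetical
helper that relies on line order rather than on comparing timestamps:

```python
import re

# Three lines from the fiveoften log, with mail-wrapped lines rejoined.
LOG = """\
Dec 10 14:09:11 fiveoften kernel: CMAN: quorum regained, resuming activity
Dec 10 14:09:11 fiveoften ccsd[3391]: Cluster is not quorate.  Refusing connection.
Dec 10 14:09:12 fiveoften ccsd[3391]: Cluster is not quorate.  Refusing connection.
"""

# timestamp, host, source (e.g. "kernel" or "ccsd[3391]"), message
LINE = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) (\S+) ([^:]+): (.*)$")


def refusals_after_quorum(log_text):
    """Count ccsd 'not quorate' lines that follow a CMAN 'quorum regained' line."""
    quorum_seen = False
    refusals = 0
    for raw in log_text.splitlines():
        match = LINE.match(raw)
        if not match:
            continue  # skip continuation lines and anything unparseable
        source, message = match.group(3), match.group(4)
        if source == "kernel" and "quorum regained" in message:
            quorum_seen = True
        elif quorum_seen and source.startswith("ccsd") and "not quorate" in message:
            refusals += 1
    return refusals


if __name__ == "__main__":
    print(refusals_after_quorum(LOG))
```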

On node fiveoften, /proc/cluster/status and /proc/cluster/nodes contain:

[mbrookov at fiveoften ~]$ more /proc/cluster/status
Protocol version: 4.0.1
Config version: 6
Cluster name: CSMTEST
Cluster ID: 9374
Membership state: Cluster-Member
Nodes: 2
Expected_votes: 1
Total_votes: 2
Quorum: 1
Active subsystems: 0
Node addresses: 138.67.4.25

[mbrookov at fiveoften ~]$ more /proc/cluster/nodes
Node  Votes Exp Sts  Name
   1    1    1   M   fouroften
   2    1    1   M   fiveoften
[mbrookov at fiveoften ~]$

On node fouroften, /proc/cluster/status and /proc/cluster/nodes contain:

[mbrookov at fouroften ~]$ more /proc/cluster/status
Protocol version: 4.0.1
Config version: 6
Cluster name: CSMTEST
Cluster ID: 9374
Membership state: Cluster-Member
Nodes: 2
Expected_votes: 1
Total_votes: 2
Quorum: 1
Active subsystems: 1
Node addresses: 138.67.4.24

[mbrookov at fouroften ~]$ more /proc/cluster/nodes
Node  Votes Exp Sts  Name
   1    1    1   M   fouroften
   2    1    1   M   fiveoften
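
For anyone who wants to check this from a script, here is a minimal
Python sketch that parses the "Key: value" lines of
/proc/cluster/status.  The sample text is the fiveoften output shown
earlier.  The test "Total_votes >= Quorum" is my assumption about what
quorate means here, not something taken from the CMAN source, and both
function names are made up for illustration:

```python
# Copy of /proc/cluster/status from node fiveoften, as posted above.
STATUS = """\
Protocol version: 4.0.1
Config version: 6
Cluster name: CSMTEST
Cluster ID: 9374
Membership state: Cluster-Member
Nodes: 2
Expected_votes: 1
Total_votes: 2
Quorum: 1
Active subsystems: 0
Node addresses: 138.67.4.25
"""


def parse_status(text):
    """Turn 'Key: value' lines into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def looks_quorate(fields):
    """Assumption: the cluster is quorate when Total_votes >= Quorum."""
    return int(fields["Total_votes"]) >= int(fields["Quorum"])


if __name__ == "__main__":
    fields = parse_status(STATUS)
    print(fields["Membership state"], looks_quorate(fields))
```

On a live node the text could be read from /proc/cluster/status instead
of the embedded sample.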

Log entries on node fouroften:
Dec 10 14:08:36 fouroften kernel: Lock_Harness <CVS> (built Dec 10 2004
09:14:45) installed
Dec 10 14:08:36 fouroften kernel: GFS <CVS> (built Dec 10 2004 09:14:04)
installed
Dec 10 14:08:36 fouroften kernel: CMAN <CVS> (built Dec 10 2004
09:51:59) installed
Dec 10 14:08:36 fouroften kernel: NET: Registered protocol family 30
Dec 10 14:08:36 fouroften kernel: DLM <CVS> (built Dec 10 2004 09:52:25)
installed
Dec 10 14:08:36 fouroften kernel: Lock_DLM (built Dec 10 2004 09:14:25)
installed
Dec 10 14:08:37 fouroften kernel: CMAN: Waiting to join or form a
Linux-cluster
Dec 10 14:09:09 fouroften kernel: CMAN: forming a new cluster
Dec 10 14:09:09 fouroften kernel: CMAN: quorum regained, resuming
activity
Dec 10 14:09:09 fouroften kernel: CMAN: got node fiveoften


/etc/cluster/cluster.conf:
<?xml version="1.0"?>
<cluster name="CSMTEST" config_version="6">

<cman two_node="1" expected_votes="1">
</cman>

<clusternodes>
	<clusternode name="fouroften" votes="1">
		<fence>
			<method name="cascade1">
				<device name="human" ipaddr="fouroften"/>
			</method>
		</fence>
	</clusternode>

	<clusternode name="fiveoften" votes="1">
		<fence>
			<method name="cascade1">
				<device name="human" ipaddr="fiveoften"/>
			</method>
		</fence>
	</clusternode>

</clusternodes>

<fencedevices>
        <fencedevice name="human" agent="fence_manual"/>
</fencedevices>

</cluster>
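
As a cross-check, the config_version in this file (6) matches the
"Config version" that /proc/cluster/status reports on both nodes.  A
short Python sketch that pulls the relevant attributes out of
cluster.conf with the standard library XML parser follows.  The
embedded text is a trimmed copy of the file above with the fence
sections left out, and summarize is a hypothetical helper:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the cluster.conf above (fence sections omitted).
CONF = """\
<?xml version="1.0"?>
<cluster name="CSMTEST" config_version="6">
<cman two_node="1" expected_votes="1">
</cman>
<clusternodes>
    <clusternode name="fouroften" votes="1"/>
    <clusternode name="fiveoften" votes="1"/>
</clusternodes>
</cluster>
"""


def summarize(conf_text):
    """Collect cluster name, config version, cman settings and node votes."""
    root = ET.fromstring(conf_text)
    cman = root.find("cman")
    expected = cman.get("expected_votes") if cman is not None else None
    return {
        "name": root.get("name"),
        "config_version": root.get("config_version"),
        "two_node": cman is not None and cman.get("two_node") == "1",
        "expected_votes": int(expected) if expected is not None else None,
        # Only nodes that carry an explicit votes attribute are listed.
        "nodes": {
            node.get("name"): int(node.get("votes"))
            for node in root.iter("clusternode")
            if node.get("votes") is not None
        },
    }


if __name__ == "__main__":
    print(summarize(CONF))
```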

Both nodes are running Fedora Core 3 with the 2.6.9 kernel from
kernel.org.

Thanks for your time!

Matt
mbrookov at mines.edu




