[Linux-cluster] problem with GNBD device

Changer Van changerv at gmail.com
Fri Oct 12 02:18:50 UTC 2007


Hi all,

I set up an HTTP HA cluster consisting of 3 nodes.
Node 1 is the gnbd server used for fencing.
Nodes 2 and 3 provide the HTTP HA service.
Suppose the HTTP service is running on node 3.
When the network cable of node 3 was unplugged,
the service failed over to node 2 properly,
but the cman service on node 3 was killed after the cable was plugged back in,
and cman's pid file was still there.
Worse, the cman service cannot be started again,
and node 3 cannot be shut down.

OS: RHEL 5 (2.6.18-8.el5)
Related RPMs:
Cluster_Administration-en-US-5.0.0-5.noarch.rpm
cluster-cim-0.8-27.el5.i386.rpm
cluster-snmp-0.8-27.el5.i386.rpm
modcluster-0.8-27.el5.i386.rpm
rgmanager-2.0.23-1.i386.rpm
system-config-cluster-1.0.50-1.0.noarch.rpm
gnbd-1.1.5-1.el5.i386.rpm
kmod-gnbd-0.1.3-4.2.6.18_8.el5.i686.rpm
kmod-gnbd-PAE-0.1.3-4.2.6.18_8.el5.i686.rpm
kmod-gnbd-xen-0.1.3-4.2.6.18_8.el5.i686.rpm

Partial log messages on node 3:
openais[6621]: [CPG  ] got joinlist message from node 1
openais[6621]: [CPG  ] got joinlist message from node 2
openais[6621]: [CMAN ] cman killed by node 3 for reason 2
gnbd_import: ERROR [../../utils/gnbd_utils.c:78] cman_init failed : Connection refused
gfs_controld[6648]: cman_start_notification error -1 104
dlm_controld[6641]: cluster is down, exiting
fenced[6635]: cluster is down, exiting
fence_node[6645]: agent "fence_gnbd" reports: gnbd_import: ERROR cannot get node name : Connection refused gnbd_import: ERROR If you are not planning to use a cluster manager, use -n failed: fence_gnbd, node03
kernel: dlm: closing connection to node 3
fence_node[6645]: Fence of "node03" was unsuccessful
kernel: dlm: closing connection to node 2
kernel: dlm: closing connection to node 1
ccsd[6615]: Unable to connect to cluster infrastructure after 30 seconds.
ccsd[6615]: Unable to connect to cluster infrastructure after 60 seconds.
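
To find the same event in the full log rather than this excerpt, a small filter along these lines might help. It is only a sketch: the function name `find_cman_kills` and the regular expression are my own illustrative assumptions, based on the message format shown above, and are not part of any cluster tool.

```python
import re

# Matches the openais line reporting that cman was killed, in the format
# seen in the excerpt above (assumed, not taken from any documentation).
KILL_RE = re.compile(r"\[CMAN\s*\] cman killed by node (\d+) for reason (\d+)")


def find_cman_kills(lines):
    """Return (node, reason) integer pairs for every matching log line."""
    hits = []
    for line in lines:
        match = KILL_RE.search(line)
        if match:
            hits.append((int(match.group(1)), int(match.group(2))))
    return hits


# Three lines copied from the excerpt above.
sample = [
    "openais[6621]: [CPG  ] got joinlist message from node 2",
    "openais[6621]: [CMAN ] cman killed by node 3 for reason 2",
    "fenced[6635]: cluster is down, exiting",
]
print(find_cman_kills(sample))  # prints [(3, 2)]
```

The same function can be given an open log file object, since it only iterates over lines.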

Any help would be greatly appreciated.

-- 
Regards,
Changer