[Linux-cluster] rgmanager not running
Balaji Sundar
balajisundar at midascomm.com
Mon Mar 7 08:33:41 UTC 2011
Dear All,
I am using RHEL 6 with kernel version 2.6.32-71.el6.i686.
I have configured Cluster Suite with two servers:
Server 1 : IP address 192.168.13.131, hostname primary
Server 2 : IP address 192.168.13.132, hostname secondary
Floating : IP address 192.168.13.133 (assumed by the currently active server)
I have verified that the cman service is running and that cluster.conf is
valid, using the ccs_config_validate command.
However, I found that rgmanager is not running and the services are not started:
[root@primary cluster]# service rgmanager status
rgmanager dead but pid file exists
[root@primary cluster]#
[root@primary cluster]# cman_tool services
[root@primary cluster]#
[root@primary cluster]# cman_tool status
Version: 6.2.0
Config Version: 1
Cluster Name: EMSCluster
Cluster Id: 808
Cluster Member: Yes
Cluster Generation: 96
Membership state: Cluster-Member
Nodes: 1
Expected votes: 1
Total votes: 1
Node votes: 1
Quorum: 1
Active subsystems: 7
Flags: 2node
Ports Bound: 0
Node name: primary
Node ID: 1
Multicast addresses: 239.192.3.43
Node addresses: 192.168.13.131
[root@primary cluster]#
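For scripted checks, individual fields of the status output above can be pulled out with a small shell helper. This is only a sketch: `cman_field` is a hypothetical name, not part of the cluster tools, and it assumes the `Name: value` layout shown above.

```shell
#!/bin/sh
# cman_field NAME: read "Name: value" lines on stdin (the layout printed by
# `cman_tool status` above) and print the value of the first matching field.
# cman_field is a hypothetical helper, not a cluster suite command.
cman_field() {
    awk -v name="$1" '
        {
            # Split on the first ": " so values containing colons stay whole.
            sep = index($0, ": ")
            if (sep > 0 && substr($0, 1, sep - 1) == name) {
                print substr($0, sep + 2)
                exit
            }
        }'
}
```

For example, `cman_tool status | cman_field Nodes` would print 1 for the output above, meaning only this node is currently counted as a cluster member.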
I found the following messages in /var/log/messages:
Mar 7 14:39:42 primary corosync[7155]: [CMAN ] quorum regained, resuming activity
Mar 7 14:39:42 primary corosync[7155]: [QUORUM] This node is within the primary component and will provide service.
Mar 7 14:39:42 primary corosync[7155]: [QUORUM] Members[1]: 1
Mar 7 14:39:42 primary corosync[7155]: [QUORUM] Members[1]: 1
Mar 7 14:39:42 primary corosync[7155]: [CPG ] downlist received left_list: 0
Mar 7 14:39:42 primary corosync[7155]: [CPG ] chosen downlist from node r(0) ip(192.168.13.131)
Mar 7 14:39:42 primary corosync[7155]: [MAIN ] Completed service synchronization, ready to provide service.
Mar 7 14:39:44 primary fenced[7210]: fenced 3.0.12 started
Mar 7 14:39:45 primary dlm_controld[7224]: dlm_controld 3.0.12 started
Mar 7 14:39:45 primary gfs_controld[7254]: gfs_controld 3.0.12 started
Mar 7 14:39:45 primary kernel: dlm: Using TCP for communications
Mar 7 14:39:45 primary dlm_controld[7224]: dlm_join_lockspace no fence domain
Mar 7 14:39:45 primary dlm_controld[7224]: process_uevent online@ error -1 errno 2
Mar 7 14:39:45 primary kernel: dlm: rgmanager: group join failed -1 -1
I also found these messages in /var/log/cluster/dlm_controld.log:
Mar 07 14:39:45 dlm_controld dlm_controld 3.0.12 started
Mar 07 14:39:45 dlm_controld dlm_join_lockspace no fence domain
Mar 07 14:39:45 dlm_controld process_uevent online@ error -1 errno 2
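The failure lines can be picked out of either log with a filter along these lines. It is a sketch only: `fence_join_errors` is a hypothetical name, and the patterns are simply the three messages quoted above, not a complete list of dlm_controld errors.

```shell
#!/bin/sh
# fence_join_errors: read log text on stdin and print only the lines that
# carry one of the three failure messages quoted in this post.
# Hypothetical helper; the pattern list is taken from this post only.
fence_join_errors() {
    grep -E 'dlm_join_lockspace no fence domain|process_uevent online@ error|group join failed'
}
```

It could be run as `fence_join_errors < /var/log/messages` or against /var/log/cluster/dlm_controld.log; it prints nothing and returns non-zero when none of the messages is present.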
I don't know what the problem is. Can someone throw some light on this
peculiar behaviour?
Thanks in advance.
--Regards
S.Balaji