[Linux-cluster] startup problem: "cman not started: Can't bind to local cman socket /usr/sbin/cman_tool: aisexec daemon didn't start "

Steven Dake sdake at redhat.com
Wed Apr 25 20:53:24 UTC 2007


After cman returns the "aisexec daemon didn't start" error, is the
"aisexec" process still running?  If it isn't, there should be a core
file in /var/run/openais, I believe.  Install the openais-debug package
if using RPMs, or build a debug version of all of the infrastructure,
and use gdb to get a backtrace:

gdb /usr/sbin/aisexec /var/run/openais/core.xxxx

Then run the "where" command at the (gdb) prompt.
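
The steps above can be sketched as a small shell helper. This is only
a rough sketch, not a tested procedure for this cluster: the function
names newest_core and diagnose_aisexec are made up for illustration, and
it assumes pgrep and a gdb that accepts -batch and -ex are installed.
The binary and core directory are the ones named above and can be
overridden through the AISEXEC_BIN and CORE_DIR variables (also
illustrative names).

```shell
#!/bin/sh
# Sketch only: default paths are the ones mentioned in this reply.
AISEXEC_BIN="${AISEXEC_BIN:-/usr/sbin/aisexec}"
CORE_DIR="${CORE_DIR:-/var/run/openais}"

# Print the most recently modified core file in directory $1,
# or print nothing if no file matching core* exists there.
newest_core() {
    ls -t "$1"/core* 2>/dev/null | head -n 1
}

# Check whether aisexec is running; if not, look for a core file
# and ask gdb for a backtrace using the "where" command.
diagnose_aisexec() {
    if pgrep -x aisexec >/dev/null 2>&1; then
        echo "aisexec is running"
        return 0
    fi
    echo "aisexec is not running"

    core=$(newest_core "$CORE_DIR")
    if [ -z "$core" ]; then
        echo "no core file found in $CORE_DIR"
        return 1
    fi
    echo "using core file: $core"

    # -batch makes gdb exit after running the commands given with -ex.
    gdb -batch -ex where "$AISEXEC_BIN" "$core"
}
```

Source the file and call diagnose_aisexec right after the init script
reports the failure. The backtrace is only readable if debug symbols
(for example from the openais-debug package) are installed.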

regards
-steve


On Wed, 2007-04-25 at 18:26 +0200, Brieseneck, Arne, VF-Group wrote:
> It is highly likely that somebody already knows this error and knows
> how to fix it: 
> 
> ---snip--- 
> [root@box1 ~]# /etc/init.d/cman start 
> Starting cluster: 
> Loading modules... done 
> Mounting configfs... done 
> Starting ccsd... done 
> Starting cman... failed 
> cman not started: Can't bind to local cman socket /usr/sbin/cman_tool:
> aisexec daemon didn't start 
> [FAILED] 
> [root@box1 ~]# 
> 
> ---snap--- 
> 
> The cluster does not start, even though what is written in the log
> file looks OK to me:
> 
> ---snip--- 
> 
> Apr 24 19:59:06 box1 ccsd[5129]: Starting ccsd 2.0.60: 
> Apr 24 19:59:06 box1 ccsd[5129]: Built: Jan 24 2007 15:31:03 
> Apr 24 19:59:06 box1 ccsd[5129]: Copyright (C) Red Hat, Inc. 2004 All
> rights reserved. 
> Apr 24 19:59:06 box1 ccsd[5129]: cluster.conf (cluster name =
> alpha_cluster, version = 6) found. 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] AIS Executive Service
> RELEASE 'subrev 1204 version 0.80.1' 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] Copyright (C) 2002-2006
> MontaVista Software, Inc and contributors. 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] Copyright (C) 2006 Red
> Hat, Inc. 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] Using default multicast
> address of 239.192.196.121 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] openais component
> openais_cpg loaded. 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] Registering service
> handler 'openais cluster closed process group service v1.01' 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] openais component
> openais_cfg loaded. 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] Registering service
> handler 'openais configuration service' 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] openais component
> openais_msg loaded. 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] Registering service
> handler 'openais message service B.01.01' 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] openais component
> openais_lck loaded. 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] Registering service
> handler 'openais distributed locking service B.01.01' 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] openais component
> openais_evt loaded. 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] Registering service
> handler 'openais event service B.01.01' 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] openais component
> openais_ckpt loaded. 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] Registering service
> handler 'openais checkpoint service B.01.01' 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] openais component
> openais_amf loaded. 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] Registering service
> handler 'openais availability management framework B.01.01' 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] openais component
> openais_clm loaded. 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] Registering service
> handler 'openais cluster membership service B.01.01' 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] openais component
> openais_evs loaded. 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] Registering service
> handler 'openais extended virtual synchrony service' 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] openais component
> openais_cman loaded. 
> Apr 24 19:59:08 box1 openais[5135]: [MAIN ] Registering service
> handler 'openais CMAN membership service 2.01' 
> Apr 24 19:59:08 box1 openais[5070]: [TOTEM] entering GATHER state from
> 12. 
> Apr 24 19:59:08 box1 openais[5070]: [TOTEM] Creating commit token
> because I am the rep. 
> Apr 24 19:59:08 box1 openais[5070]: [TOTEM] Saving state aru 1e high
> seq received 1e 
> Apr 24 19:59:08 box1 openais[5070]: [TOTEM] Storing new sequence id
> for ring 304 
> Apr 24 19:59:08 box1 openais[5070]: [TOTEM] entering COMMIT state. 
> Apr 24 19:59:09 box1 openais[5070]: [TOTEM] entering RECOVERY state. 
> Apr 24 19:59:09 box1 openais[5070]: [TOTEM] position [0] member
> 192.168.50.194: 
> Apr 24 19:59:09 box1 openais[5070]: [TOTEM] previous ring seq 768 rep
> 192.168.50.194 
> Apr 24 19:59:09 box1 openais[5070]: [TOTEM] aru 1e high delivered 1e
> received flag 0 
> Apr 24 19:59:09 box1 openais[5070]: [TOTEM] position [1] member
> 192.168.50.195: 
> Apr 24 19:59:09 box1 openais[5070]: [TOTEM] previous ring seq 768 rep
> 192.168.50.194 
> Apr 24 19:59:09 box1 openais[5070]: [TOTEM] aru 1e high delivered 1e
> received flag 0 
> Apr 24 19:59:09 box1 openais[5070]: [TOTEM] position [2] member
> 192.168.50.196: 
> Apr 24 19:59:09 box1 openais[5070]: [TOTEM] previous ring seq 768 rep
> 192.168.50.194 
> Apr 24 19:59:09 box1 openais[5070]: [TOTEM] aru 1e high delivered 1e
> received flag 0 
> Apr 24 19:59:09 box1 openais[5070]: [TOTEM] Did not need to originate
> any messages in recovery. 
> Apr 24 19:59:09 box1 openais[5070]: [CLM ] CLM CONFIGURATION CHANGE 
> Apr 24 19:59:09 box1 openais[5070]: [CLM ] New Configuration: 
> Apr 24 19:59:09 box1 openais[5070]: [CLM ] r(0) ip(192.168.50.194) 
> Apr 24 19:59:09 box1 openais[5070]: [CLM ] r(0) ip(192.168.50.195) 
> Apr 24 19:59:09 box1 openais[5070]: [CLM ] r(0) ip(192.168.50.196) 
> Apr 24 19:59:09 box1 openais[5070]: [CLM ] Members Left: 
> Apr 24 19:59:09 box1 openais[5070]: [CLM ] Members Joined: 
> Apr 24 19:59:09 box1 openais[5070]: [SYNC ] This node is within the
> primary component and will provide service. 
> Apr 24 19:59:09 box1 openais[5070]: [CLM ] CLM CONFIGURATION CHANGE 
> Apr 24 19:59:09 box1 openais[5070]: [CLM ] New Configuration: 
> Apr 24 19:59:09 box1 openais[5070]: [CLM ] r(0) ip(192.168.50.194) 
> Apr 24 19:59:09 box1 openais[5070]: [CLM ] r(0) ip(192.168.50.195) 
> Apr 24 19:59:09 box1 openais[5070]: [CLM ] r(0) ip(192.168.50.196) 
> Apr 24 19:59:09 box1 openais[5070]: [CLM ] Members Left: 
> Apr 24 19:59:09 box1 openais[5070]: [CLM ] Members Joined: 
> Apr 24 19:59:09 box1 openais[5070]: [SYNC ] This node is within the
> primary component and will provide service. 
> Apr 24 19:59:09 box1 openais[5070]: [TOTEM] entering OPERATIONAL
> state. 
> Apr 24 19:59:09 box1 openais[5070]: [CLM ] got nodejoin message
> 192.168.50.195 
> Apr 24 19:59:09 box1 openais[5070]: [CLM ] got nodejoin message
> 192.168.50.196 
> Apr 24 19:59:09 box1 openais[5070]: [CLM ] got nodejoin message
> 192.168.50.194 
> Apr 24 19:59:35 box1 ccsd[5129]: Unable to connect to cluster
> infrastructure after 30 seconds. 
> Apr 24 20:00:05 box1 ccsd[5129]: Unable to connect to cluster
> infrastructure after 60 seconds. 
> Apr 24 20:00:35 box1 ccsd[5129]: Unable to connect to cluster
> infrastructure after 90 seconds. 
> Apr 24 20:01:05 box1 ccsd[5129]: Unable to connect to cluster
> infrastructure after 120 seconds. 
> Apr 24 20:01:35 box1 ccsd[5129]: Unable to connect to cluster
> infrastructure after 150 seconds. 
> Apr 24 20:02:05 box1 ccsd[5129]: Unable to connect to cluster
> infrastructure after 180 seconds. 
> Apr 24 20:02:35 box1 ccsd[5129]: Unable to connect to cluster
> infrastructure after 210 seconds. 
> Apr 24 20:03:05 box1 ccsd[5129]: Unable to connect to cluster
> infrastructure after 240 seconds. 
> Apr 24 20:03:35 box1 ccsd[5129]: Unable to connect to cluster
> infrastructure after 270 seconds. 
> Apr 24 20:04:05 box1 ccsd[5129]: Unable to connect to cluster
> infrastructure after 300 seconds. 
> Apr 24 20:04:35 box1 ccsd[5129]: Unable to connect to cluster
> infrastructure after 330 seconds. 
> Apr 24 20:05:05 box1 ccsd[5129]: Unable to connect to cluster
> infrastructure after 360 seconds. 
> Apr 24 20:05:35 box1 ccsd[5129]: Unable to connect to cluster
> infrastructure after 390 seconds. 
> Apr 24 20:06:05 box1 ccsd[5129]: Unable to connect to cluster
> infrastructure after 420 seconds. 
> Apr 24 20:06:35 box1 ccsd[5129]: Unable to connect to cluster
> infrastructure after 450 seconds. 
> Apr 24 20:07:05 box1 ccsd[5129]: Unable to connect to cluster
> infrastructure after 480 seconds. 
> Apr 24 20:07:35 box1 ccsd[5129]: Unable to connect to cluster
> infrastructure after 510 seconds. 
> Apr 24 20:08:05 box1 ccsd[5129]: Unable to connect to cluster
> infrastructure after 540 seconds. 
> Apr 24 20:08:35 box1 ccsd[5129]: Unable to connect to cluster
> infrastructure after 570 seconds. 
> Apr 24 20:09:01 box1 ccsd[5129]: Stopping ccsd, SIGTERM received. 
> 
> ---snap--- 
> 
> This is the /etc/cluster/cluster.conf:
> 
> ---snip--- 
> [root@box1 ~]# cat /etc/cluster/cluster.conf 
> <?xml version="1.0"?> 
> <cluster alias="alpha" config_version="6" name="alpha_cluster"> 
> <cman> 
> <multicast addr="224.0.0.1"/> 
> </cman> 
> <fence_daemon post_fail_delay="0" post_join_delay="3"/> 
> <clusternodes> 
> <clusternode name="box1" nodeid="1"> 
> <multicast addr="224.0.0.1" interface="eth0"/> 
> <fence> 
> <method name="1"> 
> <device name="human" nodename="box1.sbe" 
> </method> 
> </fence> 
> </clusternode> 
> <clusternode name="box2" votes="1" nodeid="2"> 
> <multicast addr="224.0.0.1" interface="eth0"/> 
> <fence> 
> <method name="1"> 
> <device name="human" nodename="box2.sbe" 
> </method> 
> </fence> 
> </clusternode> 
> <clusternode name="box3" votes="1" nodeid="4"> 
> <multicast addr="224.0.0.1" interface="eth0"/> 
> <fence> 
> <method name="1"> 
> <device name="human" nodename="box3.sbe" 
> </method> 
> </fence> 
> </clusternode> 
> </clusternodes> 
> <fencedevices> 
> <fencedevice agent="fence_manual" name="human"/> 
> </fencedevices> 
> <rm log_level="7" log_facility="syslog"> 
> <failoverdomains/> 
> <resources/> 
> </rm> 
> </cluster> 
> 
> 
> ---snap---
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



