[Linux-cluster] cman_tool join fails

Katriel Traum katriel at penguin-it.co.il
Tue May 15 13:42:10 UTC 2007


aisexec wasn't running. The error message is misleading: cman_tool
seems to receive two error messages but displays only the first,
because of the \0 between them. Here are the last few lines from an
"strace -f -w /tmp/trace cman_tool join":

2133  socket(PF_FILE, SOCK_STREAM, 0)   = 5
2133  connect(5, {sa_family=AF_FILE, path="/var/run/cluster/ccsd.sock"}, 110) = 0
2133  write(5, "\3\0\0\0\0\0\0\0\360\0\0\0\0\0\0\0C\0\0\0/cluster/clusternodes/clusternode[@name=\"r5-nd1\"]/altname[1]/@name\0", 87) = 87
2133  read(5, "\3\0\0\0\0\0\0\0\360\0\0\0\303\377\377\377\0\0\0\0", 20) = 20
2133  close(5)                          = 0
2133  socket(PF_FILE, SOCK_STREAM, 0)   = 5
2133  connect(5, {sa_family=AF_FILE, path="/var/run/cluster/ccsd.sock"}, 110) = 0
2133  write(5, "\3\0\0\0\0\0\0\0\360\0\0\0\0\0\0\0\30\0\0\0/cluster/cman/@two_node\0", 44) = 44
2133  read(5, "\3\0\0\0\0\0\0\0\360\0\0\0\0\0\0\0\2\0\0\0", 20) = 20
2133  read(5, "1\0", 2)                 = 2
2133  close(5)                          = 0
2133  socket(PF_FILE, SOCK_STREAM, 0)   = 5
2133  connect(5, {sa_family=AF_FILE, path="/var/run/cluster/ccsd.sock"}, 110) = 0
2133  write(5, "\2\0\0\0\0\0\0\0\360\0\0\0\0\0\0\0\0\0\0\0", 20) = 20
2133  read(5, "\2\0\0\0\0\0\0\0\377\377\377\377\0\0\0\0\0\0\0\0", 20) = 20
2133  close(5)                          = 0
2133  write(4, "Cannot start, ais may already be running\0", 41) = 41
2133  write(4, "Error reading config from CCS\0", 30) = 30
2133  --- SIGSEGV (Segmentation fault) @ 0 (0) ---
2133  rt_sigaction(SIGSEGV, {SIG_DFL}, {0x8060ec0, [SEGV], SA_RESTART}, 8) = 0
2133  futex(0x817bb04, FUTEX_WAIT, 2, NULL <unfinished ...>
2131  <... select resumed> )            = 1 (in [3], left {0, 100000})
2131  read(3, "Cannot start, ais may already be running\0Error reading config from CCS\0", 1024) = 71
2131  write(2, "cman not started: Cannot start, ais may already be running\n", 59) = 59
2131  write(2, "cman_tool: ", 11)       = 11
2131  write(2, "aisexec daemon didn\'t start\n", 28) = 28
2131  exit_group(1)                     = ?
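
The truncation visible in the last lines of the trace can be reproduced in a few lines. This is a minimal Python sketch, not cman_tool's actual code: it assumes the reader treats the 71-byte buffer as a single C string, which ends at the first \0. The helper names first_c_string and all_messages are hypothetical.

```python
# The 71 bytes that process 2131 reads from fd 3 in the trace:
# two messages, each terminated by a NUL byte.
BUF = (b"Cannot start, ais may already be running\0"
       b"Error reading config from CCS\0")


def first_c_string(buf: bytes) -> bytes:
    """Return the bytes up to the first NUL, as a C "%s" format would print."""
    return buf.split(b"\0", 1)[0]


def all_messages(buf: bytes) -> list:
    """Split the buffer on NUL bytes and keep every non-empty message."""
    return [part for part in buf.split(b"\0") if part]


if __name__ == "__main__":
    # Only the first message survives C-string handling.
    print(first_c_string(BUF).decode())
    # Splitting on NUL recovers both messages.
    for msg in all_messages(BUF):
        print(msg.decode())
```

Splitting on the NUL bytes recovers the second message ("Error reading config from CCS"), which is the one that points at CCS rather than at ais.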

I would gladly send you the entire trace; I wasn't sure whether I
could attach text files to mails sent to the list.
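
For anyone reading the ccsd exchanges in the trace, the 20-byte reads can be unpacked into numbers. The sketch below assumes each one is a header of five little-endian 32-bit signed integers; that layout is inferred from the trace alone, not taken from the ccsd source, and decode_header is a hypothetical helper. Under that assumption the fifth value matches the payload length (2, followed by the 2-byte read of "1\0"), and the fourth value of the altname reply is -61, which would correspond to -ENODATA on Linux if that slot holds an error code.

```python
import struct

# 20-byte replies copied from the trace (strace prints bytes as octal escapes).
REPLY_ALTNAME = b"\3\0\0\0\0\0\0\0\360\0\0\0\303\377\377\377\0\0\0\0"
REPLY_TWO_NODE = b"\3\0\0\0\0\0\0\0\360\0\0\0\0\0\0\0\2\0\0\0"
REPLY_LAST = b"\2\0\0\0\0\0\0\0\377\377\377\377\0\0\0\0\0\0\0\0"


def decode_header(raw: bytes) -> tuple:
    """Unpack 20 bytes as five little-endian signed 32-bit integers.

    The five-integer layout is an assumption based on the trace only.
    """
    return struct.unpack("<5i", raw)


if __name__ == "__main__":
    for name, raw in [("altname", REPLY_ALTNAME),
                      ("two_node", REPLY_TWO_NODE),
                      ("last", REPLY_LAST)]:
        print(name, decode_header(raw))
```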

+Katriel

Patrick Caulfield wrote:
> Katriel Traum wrote:
> 
> That conf file looks OK.
> 
>> cman_tool join -d gives no extra information on the subject.
>> Could this be because I'm running inside a VM?
>>
> 
> Possible, but not very likely - I have a 6 node Xen cluster that works fine.
> 
> Did you check syslog for any other messages ? The things that cause that message
> are usually related to CCS problems, and they can appear in syslog.
> 
> 
> Oh, hang on: "ais may already be running" ---
> 
> Have you started openais separately? - you don't need to. cman will start it for
> you with the right parameters.
> 

-- 
Katriel Traum
CTO, Penguin IT
RHCE, CLP
Tel: 03-9411224
Mobile: 054-6789953
http://www.penguin-it.co.il



