[Linux-cluster] Cluster 3.0.0.rc3 release

Guido Günther agx at sigxcpu.org
Mon Jun 29 18:48:48 UTC 2009


Hi Fabione,
Thanks for rolling this release candidate!

On Sat, Jun 20, 2009 at 01:19:49PM +0200, Fabio M. Di Nitto wrote:
[..snip..] 
> In order to build the 3.0.0.rc3 release you will need:
> 
> - corosync 0.98
> - openais 0.97
We used these without any patches.

> - linux kernel 2.6.29
We were running against 2.6.30.

We observed these issues:

fenced segfaults with:

(gdb) bt
#0  0x00007f8e293508fe in fence_node (victim=0x114b510 "node1.foo.bar", log=0x61e0a0, log_size=32, log_count=0x7fff2e46a634) at /var/home/schmitz/3/redhat-cluster/fence/libfence/agent.c:156
#1  0x000000000040c5cd in fence_victims (fd=0x114f270) at /var/home/schmitz/3/redhat-cluster/fence/fenced/recover.c:319
#2  0x0000000000405f27 in apply_changes (fd=0x114f270) at /var/home/schmitz/3/redhat-cluster/fence/fenced/cpg.c:1056
#3  0x00007f8e2914bcc1 in cpg_dispatch () from /usr/lib/libcpg.so.4
#4  0x0000000000404588 in process_fd_cpg (ci=4) at /var/home/schmitz/3/redhat-cluster/fence/fenced/cpg.c:1351
#5  0x000000000040b0f7 in main (argc=<value optimized out>, argv=<value optimized out>) at /var/home/schmitz/3/redhat-cluster/fence/fenced/main.c:818

this leads to

1246297857 fenced 3.0.0.rc3 started
1246297857 our_nodeid 1 our_name node2.foo.bar
1246297857 logging mode 3 syslog f 160 p 6 logfile p 6 /var/log/cluster/fenced.log
1246297857 found uncontrolled entry /sys/kernel/dlm/rgmanager

when trying to restart fenced. Since the restart fails, the node has to
be rebooted.
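
For anyone wanting to check for the same condition before restarting
fenced, here is a minimal Python sketch that lists what is left under
/sys/kernel/dlm. It is not part of the cluster tools: the function name
is made up, and it assumes that every subdirectory there is a leftover
lockspace, which is inferred only from the "found uncontrolled entry"
line above.

```python
import os

# Path taken from the fenced log line above.
DLM_SYSFS_DIR = "/sys/kernel/dlm"


def leftover_dlm_entries(sysfs_dir=DLM_SYSFS_DIR):
    """Return the sorted names of subdirectories under sysfs_dir.

    Returns an empty list if the directory does not exist
    (for example when the dlm module is not loaded).
    """
    try:
        names = os.listdir(sysfs_dir)
    except FileNotFoundError:
        return []
    return sorted(
        name for name in names
        if os.path.isdir(os.path.join(sysfs_dir, name))
    )


if __name__ == "__main__":
    for name in leftover_dlm_entries():
        print(f"{DLM_SYSFS_DIR}/{name}")
```

On the node above this would presumably print
/sys/kernel/dlm/rgmanager; an empty result would suggest nothing is
left behind.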

We're also seeing:

Jun 29 19:29:03 node2 kernel: [   50.149855] dlm: no local IP address has been set
Jun 29 19:29:03 node2 kernel: [   50.150035] dlm: cannot start dlm lowcomms -107

from time to time. Stopping and starting cman several times via its init
script (the one from the Ubuntu package) makes this go away.
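
The -107 in the second kernel line looks like a negated errno. A small
Python sketch to decode such values follows; it assumes the dlm message
prints a standard Linux errno (not confirmed from the dlm source), and
the helper name is hypothetical. On Linux, 107 is ENOTCONN.

```python
import errno
import os


def describe_kernel_errno(value):
    """Map a negative kernel-style return code to (symbolic name, message)."""
    code = abs(value)
    # errno.errorcode maps numeric codes to names for the running platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    print(describe_kernel_errno(-107))
```

This has to be run on Linux, since errno numbering differs between
platforms.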

Any ideas what causes this?
Cheers,
 -- Guido



