[Linux-cluster] Cluster 3.0.0.rc3 release
agx at sigxcpu.org
Mon Jun 29 18:48:48 UTC 2009
Thanks for rolling this release candidate!
On Sat, Jun 20, 2009 at 01:19:49PM +0200, Fabio M. Di Nitto wrote:
> In order to build the 3.0.0.rc3 release you will need:
> - corosync 0.98
> - openais 0.97
We used these without any patches.
> - linux kernel 2.6.29
We were running against 2.6.30.
We observed these issues:
fenced segfaults with:
#0 0x00007f8e293508fe in fence_node (victim=0x114b510 "node1.foo.bar", log=0x61e0a0, log_size=32, log_count=0x7fff2e46a634) at /var/home/schmitz/3/redhat-cluster/fence/libfence/agent.c:156
#1 0x000000000040c5cd in fence_victims (fd=0x114f270) at /var/home/schmitz/3/redhat-cluster/fence/fenced/recover.c:319
#2 0x0000000000405f27 in apply_changes (fd=0x114f270) at /var/home/schmitz/3/redhat-cluster/fence/fenced/cpg.c:1056
#3 0x00007f8e2914bcc1 in cpg_dispatch () from /usr/lib/libcpg.so.4
#4 0x0000000000404588 in process_fd_cpg (ci=4) at /var/home/schmitz/3/redhat-cluster/fence/fenced/cpg.c:1351
#5 0x000000000040b0f7 in main (argc=<value optimized out>, argv=<value optimized out>) at /var/home/schmitz/3/redhat-cluster/fence/fenced/main.c:818
this leads to
1246297857 fenced 3.0.0.rc3 started
1246297857 our_nodeid 1 our_name node2.foo.bar
1246297857 logging mode 3 syslog f 160 p 6 logfile p 6 /var/log/cluster/fenced.log
1246297857 found uncontrolled entry /sys/kernel/dlm/rgmanager
when trying to restart fenced. Since fenced cannot be restarted in this
state, the node has to be rebooted.
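To see which entries are still present before attempting a restart, something
like the following could be used. This is only a minimal sketch: it assumes the
leftover entries show up as directories under /sys/kernel/dlm, as the log line
above suggests, and the function name list_dlm_entries is made up for
illustration (it is not part of fenced or any cluster tool).

```python
import os


def list_dlm_entries(sysfs_dir="/sys/kernel/dlm"):
    """Return the sorted names of the directories found under sysfs_dir.

    Assumption: each leftover entry (e.g. "rgmanager") appears as a
    directory below /sys/kernel/dlm. Returns an empty list if the
    directory does not exist (for example when dlm is not loaded).
    """
    try:
        with os.scandir(sysfs_dir) as entries:
            return sorted(e.name for e in entries if e.is_dir())
    except FileNotFoundError:
        return []


if __name__ == "__main__":
    # Print one entry per line; no output means nothing was found.
    for name in list_dlm_entries():
        print(name)
```

On a node in the state described above we would expect this to list at least
rgmanager; it only reads sysfs and does not try to clean anything up.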
We're also seeing:
Jun 29 19:29:03 node2 kernel: [ 50.149855] dlm: no local IP address has been set
Jun 29 19:29:03 node2 kernel: [ 50.150035] dlm: cannot start dlm lowcomms -107
from time to time. Stopping and starting cman several times via its init
script (as shipped in the Ubuntu package) makes this go away.
Any ideas what causes this?