[Linux-cluster] cman_tool join causes other nodes to kernel panic
Dan B. Phung
phung at cs.columbia.edu
Mon May 16 07:02:49 UTC 2005
Yes, I updated cluster.conf by adding more nodes.
E.g., I added several of these blocks (6 more nodes, to be exact):
<clusternode name="blade06" nodeid="1" votes="1">
  <multicast addr="224.0.0.18" interface="eth0"/>
  <fence>
    <method name="single">
      <device name="human" ipaddr="129.58.15.6"/>
    </method>
  </fence>
</clusternode>
I first updated the file (incrementing the version), and then ran:
ccs_tool update cluster.conf
cman_tool version -r 3
These commands completed without incident. The failure occurred when
running 'cman_tool join -w' on the new node.
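Before pushing an edited cluster.conf with the steps above, it can help to
sanity-check the file offline. The following is a minimal sketch in Python,
not part of the cluster suite: check_cluster_conf and the SAMPLE document are
hypothetical names made up for illustration, and the sketch assumes the
version lives in a config_version attribute on the root <cluster> element.
It reports the version and any clusternode name or nodeid that appears more
than once. Whether such a duplicate has anything to do with the assertion
reported in this thread is not established.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def check_cluster_conf(xml_text):
    """Return (config_version, problems) for a cluster.conf document.

    Hypothetical helper: it only inspects the XML text and never talks
    to ccsd or cman.
    """
    root = ET.fromstring(xml_text)
    # Assumption: the root <cluster> element carries config_version.
    version = int(root.get("config_version", "0"))
    nodes = root.findall(".//clusternode")
    problems = []
    for attr in ("name", "nodeid"):
        # Count how often each value of this attribute occurs.
        counts = Counter(n.get(attr) for n in nodes if n.get(attr) is not None)
        for value, count in sorted(counts.items()):
            if count > 1:
                problems.append(f"duplicate {attr}={value!r} on {count} nodes")
    return version, problems


# Made-up sample with two nodes that share a nodeid.
SAMPLE = """
<cluster name="example" config_version="3">
  <clusternodes>
    <clusternode name="blade06" nodeid="1" votes="1"/>
    <clusternode name="blade07" nodeid="1" votes="1"/>
  </clusternodes>
</cluster>
"""

if __name__ == "__main__":
    version, problems = check_cluster_conf(SAMPLE)
    print("config_version:", version)
    for line in problems:
        print("problem:", line)
```

If the check comes back clean and the reported version matches the number
about to be passed to 'cman_tool version -r', the file is at least internally
consistent on those two points.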
On 16, May, 2005, David Teigland declared:
> On Sun, May 15, 2005 at 11:06:23AM -0400, Dan B. Phung wrote:
> > I was adding another node to my cluster, so I updated the configurations
> > and did cman_tool join -w, which caused all the other nodes to kernel
> > panic, which prompted reboot of the cluster. I pasted the syslog of the
> > blade I just added and the kernel panic message from the other blades
> > below. I've done this same procedure several times before, so I don't
> > know why this time it caused this assertion.
> >
> > on the other machines, I see this:
> >
> > SM: Assertion failed on line 52 of file
> > /usr/src/cluster-2.6.9/cman-kernel/src/sm_misc.c
> > SM: assertion: "!error"
> > SM: time = 272181619
>
> This means there's some sort of internal consistency error within cman.
> If you could explain in more detail the steps you took prior to this I'll
> try to reproduce it. It sounds like you may have been updating
> cluster.conf while the cluster was running. If so, what exactly did you
> change?
>
> Dave
>