[Linux-cluster] node fails to join cluster after it was fenced
Frederik Ferner
frederik.ferner at diamond.ac.uk
Thu Feb 15 11:36:03 UTC 2007
On Thu, 2007-02-15 at 09:07 +0000, Patrick Caulfield wrote:
> Frederik Ferner wrote:
> > On Wed, 2007-02-14 at 16:33 +0000, Patrick Caulfield wrote:
> >> Frederik Ferner wrote:
> >>> I've just discovered that I seem to have the same problem on one more
> >>> cluster, so maybe I've changed something that causes this but did not
> >>> affect a running cluster. I'll append the cluster.conf for the original
> >>> cluster as well.
> >>>
> >> Looking at the tcpdump it seems that the existing node isn't seeing the joinreq
> >> message from the fenced one - there are no responses to it at all. You haven't
> >> enabled any iptables filtering have you ?
> >
> > But they seem to reach the network card at least, correct? So I don't
> > have to start looking at the switch, do I?
>
> Well, they are reaching tcpdump - more than that is hard to say ;-)
> You could try running tcpdump on the two machines to see if the packets are the
> same on both...if so then it could be some strange bug in cman that we've not
> seen before that's preventing it seeing incoming packets (I have no idea what
> that might be though, off hand)
I've had a look at the tcpdump on both machines at the same time. The
packets look identical to me. I've attached the two tcpdump files, maybe
someone can see a difference that I'm missing.
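
For anyone who wants to repeat the comparison mechanically rather than by eye, here is a minimal sketch. It assumes both captures are classic libpcap files with microsecond timestamps (the default `tcpdump -w` output; pcapng and the nanosecond variant are not handled) and that both hosts sit on the same network segment, so a frame should be byte-identical in both captures. The function names `read_pcap_packets` and `diff_captures` are made up for illustration and are not part of tcpdump or cman.

```python
import hashlib
import struct
from collections import Counter

GLOBAL_HEADER_LEN = 24   # classic pcap file header
RECORD_HEADER_LEN = 16   # per-packet header: ts_sec, ts_usec, incl_len, orig_len


def read_pcap_packets(data: bytes) -> list:
    """Return the raw captured frames from a classic pcap file image."""
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"          # file written on a little-endian host
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"          # file written on a big-endian host
    else:
        raise ValueError("not a classic microsecond pcap file")

    packets = []
    offset = GLOBAL_HEADER_LEN
    while offset + RECORD_HEADER_LEN <= len(data):
        _sec, _usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", data, offset
        )
        offset += RECORD_HEADER_LEN
        packets.append(data[offset:offset + incl_len])
        offset += incl_len
    return packets


def diff_captures(capture_a: bytes, capture_b: bytes):
    """Return (frames only in A, frames only in B) as Counters of SHA-1 digests.

    Timestamps are ignored on purpose: the clocks on the two hosts will differ.
    """
    digests_a = Counter(
        hashlib.sha1(p).hexdigest() for p in read_pcap_packets(capture_a)
    )
    digests_b = Counter(
        hashlib.sha1(p).hexdigest() for p in read_pcap_packets(capture_b)
    )
    return digests_a - digests_b, digests_b - digests_a
```

To use it, read each capture with `open(path, "rb").read()` and pass the two byte strings to `diff_captures`. Two empty counters mean every captured frame appears in both files. A frame that shows up on only one side would point at the network rather than at cman.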
> It would be interesting to know - though you may not want to do it - if the
> problem persists when the still-running node is rebooted.
Obviously not at the moment, but I have a maintenance window coming up
soon where I might be able to do that. I'll keep you informed about the
result.
Thanks for looking into that,
Frederik
--
Frederik Ferner
Systems Administrator Phone: +44 (0)1235-778624
Diamond Light Source Fax: +44 (0)1235-778468