[Linux-cluster] node fails to join cluster after it was fenced
pcaulfie at redhat.com
Thu Feb 15 09:07:03 UTC 2007
Frederik Ferner wrote:
> On Wed, 2007-02-14 at 16:33 +0000, Patrick Caulfield wrote:
>> Frederik Ferner wrote:
>>> I've just discovered that I seem to have the same problem on one more
>>> cluster, so maybe I've changed something that causes this but did not
>>> affect a running cluster. I'll append the cluster.conf for the original
>>> cluster as well.
>> Looking at the tcpdump it seems that the existing node isn't seeing the joinreq
>> message from the fenced one - there are no responses to it at all. You haven't
>> enabled any iptables filtering, have you?
> But they seem to reach the network card at least, correct? So I don't
> have to start looking at the switch, do I?
Well, they are reaching tcpdump - more than that is hard to say ;-)
It's hard to make much sense of the symptoms to be quite honest. If it's a
switch problem then I would expect it to affect running nodes as well as joining
ones - that's the whole point of the heartbeat!
You could try running tcpdump on the two machines to see if the packets are the
same on both. If so, then it could be some strange bug in cman that we've not
seen before that's preventing it from seeing incoming packets (I have no idea
what that might be though, offhand).
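
To make that comparison less tedious, something like the following could be used
to diff two text captures. This is only a rough sketch, not a tested tool: it
assumes each capture is plain tcpdump text output with the timestamp as the first
field, and the function names, addresses and ports are all made up for
illustration.

```python
from collections import Counter


def normalise(line):
    """Drop the leading timestamp field so lines from two hosts can compare equal."""
    parts = line.strip().split(None, 1)
    return parts[1] if len(parts) == 2 else ""


def diff_captures(lines_a, lines_b):
    """Return (only_in_a, only_in_b) as Counters of timestamp-stripped lines."""
    a = Counter(filter(None, map(normalise, lines_a)))
    b = Counter(filter(None, map(normalise, lines_b)))
    # Counter subtraction keeps only positive counts, so duplicates are respected.
    return a - b, b - a


# Hypothetical sample lines; real captures would be read from files instead.
joining = [
    "09:00:01.000101 IP 192.0.2.2.5000 > 192.0.2.255.5000: UDP, length 28",
    "09:00:02.000207 IP 192.0.2.2.5000 > 192.0.2.255.5000: UDP, length 28",
]
running = [
    "09:00:01.000455 IP 192.0.2.2.5000 > 192.0.2.255.5000: UDP, length 28",
]

missing_on_running, extra_on_running = diff_captures(joining, running)
for packet, count in missing_on_running.items():
    print(f"seen {count}x more on the joining node: {packet}")
```

Timestamps are dropped because the clocks on the two machines will not agree
exactly. Anything left in missing_on_running was sent by the joining node but
never showed up in the capture on the running one.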
It would be interesting to know - though you may not want to do it - if the
problem persists when the still-running node is rebooted.