It sounds like they're fencing each other. We got around this issue on a two-node cluster by adding the other node's internal IP address to the /etc/hosts file on both hosts and connecting the service network with a crossover cable, using the private IP addresses assigned to that network. If you're trying to get the nodes to monitor each other over the public network, in theory this could be done with a backup fencing method, but we weren't able to get this to work, since the heartbeat functions only run on the network that the node names are defined to use.<br> <br><div class="gmail_quote">On Mon, May 25, 2009 at 5:28 AM, ESGLinux <span dir="ltr"><<a href="mailto:esggrupos@gmail.com">esggrupos@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> Hi, <div><br></div><div>I don't think this is my problem, because fencing works fine. The nodes get fenced immediately, but I think they are being fenced when they shouldn't be. </div><div><br></div><div>Greetings, </div><div><br></div> <div>ESG<br><br><div class="gmail_quote">2009/5/22 jorge sanchez <span dir="ltr"><<a href="mailto:xsanch@gmail.com" target="_blank">xsanch@gmail.com</a>></span><div><div></div><div class="h5"><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> Hi, <br><br>Also try disabling ACPI if it is running; see the following:<br><br><a href="http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Cluster_Administration/s1-acpi-CA.html" target="_blank">http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Cluster_Administration/s1-acpi-CA.html</a><br> <br><br>Regards,<br><font color="#888888"><br>Jorge Sanchez</font><div><div></div><div><br><br><div class="gmail_quote">On Thu, May 21, 2009 at 5:34 PM, ESGLinux <span dir="ltr"><<a href="mailto:esggrupos@gmail.com" target="_blank">esggrupos@gmail.com</a>></span> wrote:<br> 
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> <br><br><div class="gmail_quote"><div>2009/5/21 Jonathan Brassow <span dir="ltr"><<a href="mailto:jbrassow@redhat.com" target="_blank">jbrassow@redhat.com</a>></span><br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> <div><br><div> On May 21, 2009, at 9:57 AM, ESGLinux wrote:<br> <br> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> Hello,<br> <br> These are the logs I get:<br> <br> On node1:<br> <br> May 21 11:33:44 NODE1 fenced[3840]: NODE2 not a cluster member after 5 sec post_fail_delay<br> May 21 11:33:44 NODE1 fenced[3840]: fencing node "NODE2"<br> May 21 11:33:44 NODE1 shutdown[5448]: shutting down for system halt<br> <br> On node2:<br> <br> May 21 11:33:45 NODE2 fenced[3843]: NODE1 not a cluster member after 5 sec post_fail_delay<br> May 21 11:33:45 NODE2 fenced[3843]: fencing node "NODE1"<br> May 21 11:33:45 NODE2 shutdown[5923]: shutting down for system halt<br> <br> <br> What I don't know is why they lose the connection with the cluster; they are still connected (I only unplug a cable from the service network).<br> </blockquote> <br></div></div><div> That may be something worth chasing down, as it appears that your cluster communication is on a network you don't expect.<br> </div></blockquote><div><br>How can I be sure which network the nodes are using for communication? I think they use the network I configured for that...<br> </div><div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> <br> Also, are the nodes simply "shutting down", or are they being forcibly rebooted? 
If it is a graceful shutdown, then it would appear that both nodes are trying to shut down simultaneously.<div><div></div><div> </div></div></blockquote></div><div><br>They simply shut down. They do not reboot. <br><br>This is what I get every time I unplug the network cable from eth0 on either of the two nodes. (They communicate through eth1...)<br><br>Greetings, <br> <br>ESG<br><br><br><br> </div><div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div><div><br> <br> brassow<br> <br> --<br> Linux-cluster mailing list<br> <a href="mailto:Linux-cluster@redhat.com" target="_blank">Linux-cluster@redhat.com</a><br> <a href="https://www.redhat.com/mailman/listinfo/linux-cluster" target="_blank">https://www.redhat.com/mailman/listinfo/linux-cluster</a><br> </div></div></blockquote></div></div><br></blockquote></div><br> </div></div><br></blockquote></div></div></div><br></div> <br></blockquote></div><br>