[Linux-cluster] RHEL Cluster Suite + Xen Dom0 = infinite reboots

Juan Ramon Martin Blanco robejrm at gmail.com
Wed Jul 8 22:10:41 UTC 2009


On Wed, Jul 8, 2009 at 11:52 PM, Aaron Benner <tfrumbacher at gmail.com> wrote:

> I have 3 xen Dom0 machines upon which I'm trying to build a cluster for HA
> DomUs.  At present the cluster config file simply lists the 3 nodes.  No
> fencing, services, resources or failover domains have been defined.  I know
> that this is not what I will need moving to production.  I was using the
> most minimal cluster config I could to ensure that my problem was the
> interaction of Xen and the cluster suite.
>
> The problem is this:  when a node reboots it joins the cluster
> successfully, then Xen tears down the network to build xenbr0, vif0.0, and
> peth0 (standard /etc/xen/scripts/network-bridge).  When this happens the
> rebooting node "fails" in the cluster's eyes, and the active nodes try to
> fence it.  Originally I had power fencing enabled, and this situation resulted
> in the shootout at the O.K. Corral, with the failed node booting, failing, and
> getting fenced forever.
>
So you have fencing configured in the domU cluster but not in the dom0
cluster, is that right? And this behavior happens in the dom0 cluster. Maybe
you should configure an additional physical network interface (or a bond of
interfaces), independent of the one used by Xen, to serve as the cluster's
main communications interface.
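
A minimal sketch of what that could look like, assuming each dom0 has a
second NIC (eth1 here) that is not part of xenbr0, and assuming cman uses
whichever address a node's name in cluster.conf resolves to. The host names,
addresses, and cluster name are hypothetical:

```xml
<?xml version="1.0"?>
<!-- Sketch only; all names and addresses are hypothetical.
     Assumed /etc/hosts entries on every dom0 (addresses live on eth1,
     a dedicated NIC that Xen's network-bridge script does not touch):
       192.168.100.1   dom0a-clu
       192.168.100.2   dom0b-clu
       192.168.100.3   dom0c-clu
-->
<cluster name="xenhosts" config_version="1">
  <clusternodes>
    <!-- Node names resolve to the eth1 addresses above, not to eth0 -->
    <clusternode name="dom0a-clu" nodeid="1" votes="1"/>
    <clusternode name="dom0b-clu" nodeid="2" votes="1"/>
    <clusternode name="dom0c-clu" nodeid="3" votes="1"/>
  </clusternodes>
  <!-- Fencing left empty to mirror the minimal test setup described above -->
  <fencedevices/>
  <rm/>
</cluster>
```

If the node names resolve to eth1 addresses, the bridge setup on eth0 should
no longer interrupt cluster traffic. Like the original test config, this has
no fencing and is not suitable for production.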

Greetings,
Juanra

>
> I did find the gem at the very bottom of the FAQ in the GeneralQuestions
> section (
> http://sources.redhat.com/cluster/wiki/FAQ/GeneralQuestions#xencluster)
> that mentions this situation.  The "workaround", which also mentions a "more
> permanent solution", seems, well, clunky, so I thought I'd ping the list to
> see whether the more permanent solution exists and is just not well
> documented, or whether others have found a solution that doesn't require
> overriding the default Xen script behavior.