[Linux-cluster] Cman (and corosync) starting before network interface is ready

Vallevand, Mark K Mark.Vallevand at UNISYS.com
Wed Sep 17 15:20:54 UTC 2014


I tried replacing the switch with a crossover cable, and the problem goes away.  So the odd delay appears to be in the switch: the NIC is configured, but it takes 4 seconds for the link to go up.  Huh.

We have a dedicated network for all the cluster traffic.  Nothing else uses it.  In the two-node case, we use a cable.  In larger clusters we will use a switch.  First delivery is for two-node clusters.  But, I worry about that slow switch.

Regards.
Mark K Vallevand
"If there are no dogs in Heaven, then when I die I want to go where they went."
-Will Rogers

THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers.
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Vallevand, Mark K
Sent: Tuesday, September 16, 2014 04:20 PM
To: linux clustering
Subject: [Linux-cluster] Cman (and corosync) starting before network interface is ready

It looks like there is some odd delay in getting a network interface up and ready.  So, when cman starts corosync, corosync can't reach the rest of the cluster, and for a time the node is a member of a cluster-of-one.  The cluster-of-one begins starting resources.  A few seconds later, when the interface finally is up and ready, it takes about 30 more seconds for the cluster-of-one to rejoin the larger cluster.  The doubly-started resources are sorted out and all ends up OK.

Now, it is not a good thing to have these particular resources running twice.  I'd really like the clustering software to behave better.  But, I'm not sure what 'behave better' would be.

Is it possible to introduce a delay into cman or corosync startup?  Is that even wise?
Is there a parameter to get the clustering software to poll more often when it can't rejoin the cluster?
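
One way to approximate a startup delay, without changing cman or corosync themselves, is to block until the kernel reports carrier on the cluster interface and only then start cman.  The following is a minimal sketch of that idea, not a tested fix: it assumes link state is what is slow, it relies on the Linux sysfs file /sys/class/net/<iface>/carrier, and the function names, the 30-second timeout, and the 0.5-second polling interval are all illustrative choices rather than anything taken from cman or corosync.

```python
import os
import time


def link_is_up(iface, sysfs_root="/sys/class/net"):
    """Return True if the kernel reports carrier on iface."""
    path = os.path.join(sysfs_root, iface, "carrier")
    try:
        with open(path) as f:
            return f.read().strip() == "1"
    except OSError:
        # The file is missing, or cannot be read while the interface is
        # administratively down; treat both cases as "not ready yet".
        return False


def wait_for_link(iface, timeout=30.0, interval=0.5,
                  sysfs_root="/sys/class/net"):
    """Poll until iface has carrier or timeout seconds have elapsed.

    Returns True if the link came up in time, False otherwise.
    The timeout and interval defaults are assumptions for illustration.
    """
    deadline = time.monotonic() + timeout
    while True:
        if link_is_up(iface, sysfs_root):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A wrapper could call wait_for_link("eth1") (the interface name here is hypothetical) and go on to whatever normally starts cman only if it returns True.  Carrier alone does not prove that traffic can reach the other nodes, so this narrows the window rather than closing it.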

Any suggestions would be welcome.

Running Ubuntu 12.04 LTS.  Pacemaker 1.1.6.  Cman 3.1.7.  Corosync 1.4.2.


