[Linux-cluster] node failing

Lon Hohberger lhh at redhat.com
Wed Jul 14 15:20:04 UTC 2004


On Wed, 2004-07-14 at 11:48 +1200, Royce Brown wrote:

> I am trying to track down a problem I’ve been having with the
> clustering software on Red Hat 3.0 (supplied RPMs).

This would be taroon-list material, actually.

> I am running a 2-node cluster using multicast heartbeat and a network
> tiebreaker IP address, and have bonded Ethernet interfaces connected
> to different switches.

Good.  Try running in HA-bonded/failover mode if you're not already.
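One way to check which mode the bond is in is to read the bonding
driver's status file. The sketch below is only illustrative: it assumes
the driver exposes its status at /proc/net/bonding/bond0 (the path and
the interface name bond0 are assumptions, and older drivers may use a
different location), and it assumes "failover mode" corresponds to the
driver's active-backup mode. The function names are hypothetical.

```python
# Hypothetical helper: report the bonding mode from the driver's status text.
# Assumes the status contains a line of the form "Bonding Mode: <description>".

BOND_STATUS_PATH = "/proc/net/bonding/bond0"  # assumed path and interface name


def bonding_mode(status_text):
    """Return the text after 'Bonding Mode:', or None if the line is absent."""
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Bonding Mode":
            return value.strip()
    return None


def is_failover(mode):
    """True if the reported mode looks like active-backup (failover)."""
    return mode is not None and "active-backup" in mode


if __name__ == "__main__":
    try:
        with open(BOND_STATUS_PATH) as handle:
            mode = bonding_mode(handle.read())
    except OSError as err:
        print("could not read %s: %s" % (BOND_STATUS_PATH, err))
    else:
        print("bonding mode:", mode)
        print("failover (active-backup):", is_failover(mode))
```

If the mode reported is something other than active-backup, the bond is
doing load balancing or aggregation rather than plain failover.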

> There are no networking problems that I can see. On the bad node I can
> ping the other node by its address and the multicast address. I have
> full debug mode on, but the log files don’t show anything.

You should file a support ticket with Red Hat Support:

http://www.redhat.com/apps/support

> Has anyone else seen this problem, or can anyone give me some tips on
> what to look at next?

Try the latest package from the RHN beta channel if you have access to
it.  It fixes a problem which causes membership to enter an infinite
loop in some cases where timeouts occurred.  The infinite loop causes
multiple clumembd (or cluquorumd) processes to appear.
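To check whether a node is showing this symptom, you can count how many
copies of each daemon are running. This is a rough sketch, not a
supported diagnostic: it assumes a Python 3.7+ interpreter and a ps that
accepts "-e -o comm=", and it assumes that more than one instance of a
daemon is the sign to look for. The function names are made up for
illustration.

```python
# Hypothetical check: count running clumembd / cluquorumd processes.
import subprocess
from collections import Counter

DAEMONS = ("clumembd", "cluquorumd")


def count_daemons(process_names, daemons=DAEMONS):
    """Return {daemon: count} for the given list of process names."""
    counts = Counter(name for name in process_names if name in daemons)
    return {daemon: counts.get(daemon, 0) for daemon in daemons}


def running_process_names():
    """List the command name of every process, as reported by ps."""
    result = subprocess.run(
        ["ps", "-e", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    for daemon, count in count_daemons(running_process_names()).items():
        # Assumption: more than one instance suggests the looping behaviour.
        note = "  <-- multiple instances" if count > 1 else ""
        print("%s: %d%s" % (daemon, count, note))
```

Running it on both nodes and comparing the counts should show whether
only the bad node has extra processes.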

Here's a ref to the bugzilla:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=126316

-- Lon



