[Linux-cluster] RHCS 3 "could not connect to service manager"

Lon Hohberger lhh at redhat.com
Wed Nov 15 19:50:40 UTC 2006


On Wed, 2006-11-15 at 17:29 +0000, Karl Podesta wrote:

> When we do this, 50% of packets get through (i.e. load balancing is working
> and we can ping the other node), but the service fails to relocate with the
> above error. When we have both NICs enabled, 100% of packets get through, 
> and service relocation works fine. So this seems to establish that network
> activity/problems can disrupt the relocation of services if one of the nodes 
> is using load balancing on its network bonding. Sound reasonable?

Looks like you found it.

I don't think you mentioned that one of the NICs was dead in your
original post.  Losing a NIC in active/active load-balancing bonding
will definitely cause problems.  The bonding driver isn't very smart
about losing a link in load-balancing mode.

I would switch both nodes to active/backup.
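
After switching (with the bonding driver that is the mode=active-backup
option, also written mode=1), it can help to confirm on each node that the
bond really is in active/backup and that no slave link is down. Below is a
rough Python sketch of such a check, not an official tool. It assumes the
status text has the usual /proc/net/bonding/bond0 layout ("Bonding Mode:",
"Slave Interface:", "MII Status:" lines); the exact wording can vary between
kernel versions, and the function names are made up for illustration.

```python
def parse_bond_status(text):
    """Return (bonding_mode, {slave_name: mii_status}) from bonding status text.

    Assumes the layout of /proc/net/bonding/<bond>; field names may differ
    on some kernels, so treat this as a sketch.
    """
    mode = None
    slaves = {}
    current = None  # slave section we are currently inside, if any
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            mode = value
        elif key == "Slave Interface":
            current = value
            slaves[current] = None
        elif key == "MII Status" and current is not None:
            # An "MII Status" line before any slave section describes the
            # bond itself and is skipped here.
            slaves[current] = value
    return mode, slaves


def bond_problems(text):
    """List human-readable problems; an empty list means the bond looks sane."""
    mode, slaves = parse_bond_status(text)
    problems = []
    if mode is None or "active-backup" not in mode:
        problems.append("bonding mode is %r, not active-backup" % mode)
    for name, status in sorted(slaves.items()):
        if status != "up":
            problems.append("slave %s link is %r" % (name, status))
    return problems
```

On a node, something like
bond_problems(open("/proc/net/bonding/bond0").read()) should return an empty
list when the bond is in active/backup with both links up (bond0 is an
assumed interface name).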

-- Lon





