[rhn-users] RHCS quorum/multipath

Frank Clements fclements at inetu.net
Mon Sep 28 13:05:38 UTC 2009


Hello list,

I'm in the process of setting up a two-node cluster attached to a Dell
MD3000i.  Both nodes are connected via two switches to provide full backend
redundancy.  The Dell MPP drivers are installed and report everything as
OK, and path failover works as expected (although it takes longer than I
would like).

The issue I'm running into is that, in the event of a path failure, CMAN
complains about losing contact with the quorum device and eventually fences
the node which lost the path.  I've tried raising the totem timeout and
quorum_dev_poll (all configs/logs attached).  The quorum timeout is
slightly lower than both.  I followed RH KB doc 2882, which says to set the
quorum timeout (interval * tko) to a value 1.7x larger than the multipath
failover value and totem to a value 2.7x greater.
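
The sizing rule quoted above can be worked through as a small calculation.
This is only a sketch under stated assumptions: both multipliers are read as
multiples of the multipath failover time, the quorum interval is taken to be
in seconds, the totem timeout is taken to be in milliseconds, and the
function name and the 60-second failover figure are hypothetical, not taken
from the attached configs.

```python
import math
from fractions import Fraction


def suggest_timeouts(failover_s, interval_s=1, qdisk_factor="1.7", totem_factor="2.7"):
    """Derive quorum and totem timeouts from a multipath failover time.

    Assumption: both factors are multiples of the multipath failover time.
    Fractions keep the arithmetic exact, so rounding up never overshoots
    because of floating-point error.
    """
    failover = Fraction(str(failover_s))
    interval = Fraction(str(interval_s))
    # Smallest tko for which interval * tko covers failover * qdisk_factor.
    tko = math.ceil(failover * Fraction(qdisk_factor) / interval)
    # Totem timeout, expressed in milliseconds (assumed unit).
    totem_ms = math.ceil(failover * Fraction(totem_factor) * 1000)
    return {
        "interval": interval_s,
        "tko": tko,
        "qdisk_timeout_s": float(interval * tko),
        "totem_ms": totem_ms,
    }


# Hypothetical input: a 60-second path failover and a 2-second quorum interval.
print(suggest_timeouts(60, interval_s=2))
```

With those hypothetical inputs the result is tko=51 (a 102-second quorum
timeout) and a totem timeout of 162000 ms.  The real input should be the
measured path failover time, not the time until the node is fenced.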

Now, there is little to no documentation from Dell on what the values in
mpp.conf mean.  I _think_ I had reset these to some fairly low values, but
I have since reverted to the default config in the initrd image.  For the
attached messages file I shut down one of the backend interfaces to simulate
a path failure; after about a minute the node is evicted and fenced.  I just
can't seem to get all the values to line up to allow path failover without
node fencing.

Has anyone else experienced this, and if so, how did you resolve it?
I'm at a loss at this point, so any help is appreciated.

Frank Clements
-------------- next part --------------
A non-text attachment was scrubbed...
Name: messages
Type: application/octet-stream
Size: 2964 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/rhn-users/attachments/20090928/beec6e85/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mpp.conf
Type: application/octet-stream
Size: 697 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/rhn-users/attachments/20090928/beec6e85/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cluster.conf
Type: application/octet-stream
Size: 3047 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/rhn-users/attachments/20090928/beec6e85/attachment-0002.obj>