[Linux-cluster] Heartbeat tolerance on busy networks (RHCS3)

Karl Podesta kpodesta at redbrick.dcu.ie
Wed Feb 20 15:37:07 UTC 2008


On Tue, Feb 19, 2008 at 09:43:03PM +0100, Pavlos Parissis wrote:
> It would be much better and more useful for everyone if you posted your cluster.conf and the error you get.
> 
> You can always test your configuration before using it with the rg_test tool.
> 
> Cheers,
> Pavlos

Ok - I make the change in cluster.conf (see below), then do:

  ccs_tool update /etc/cluster/cluster.conf
  cman_tool version -r 68

... this works fine, with no errors. clustat reports the cluster working as
normal, and there are no messages in /var/log/messages etc. But when I then
open system-config-cluster, I get a window with an error telling me it can't
read the config file (error is below). Any help you can give is appreciated!
Or even if you'd recommend something different to allow tolerance for missed
heartbeats.
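
A minimal sketch of the version bump that goes with the two commands above, assuming only the Python standard library. The number passed to "cman_tool version -r" matches the config_version attribute in the file, so that attribute has to be incremented on every edit. The helper name bump_config_version is hypothetical, and the sketch works on a string rather than touching /etc/cluster/cluster.conf:

```python
import re

def bump_config_version(conf_text):
    """Return (new_text, new_version) with config_version incremented by one.

    Hypothetical helper: it only rewrites the attribute on the <cluster>
    element and does not call ccs_tool or cman_tool.
    """
    pattern = re.compile(r'(<cluster\b[^>]*\bconfig_version=")(\d+)(")')
    match = pattern.search(conf_text)
    if match is None:
        raise ValueError("no config_version attribute found on <cluster>")
    new_version = int(match.group(2)) + 1
    new_text = pattern.sub(
        lambda m: m.group(1) + str(new_version) + m.group(3),
        conf_text,
        count=1,
    )
    return new_text, new_version

# Illustrative input only: the file as it might have looked before the edit.
sample = '<?xml version="1.0"?>\n<cluster config_version="67" name="mycluster">\n</cluster>\n'
updated, version = bump_config_version(sample)
print(version)
```

The returned version is the value that would then be given to "cman_tool version -r" after "ccs_tool update" has propagated the file.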

Incidentally - it appears deadnode_timeout isn't listed as an attribute of the
cman element in the RHEL 4 cluster.conf schema:
(http://sources.redhat.com/cluster/doc/cluster_schema_rhel4.html)
... so maybe deadnode_timeout isn't a valid attribute anymore?


Errors (from system-config-cluster) :
=====================================

  "A problem was encountered while reading configuration file ..
   Details or the error appear below. Click the 'New' button to create
   a new configuration file. To continue anyway (Not Recommended!), 
   click the 'Ok' button

/etc/cluster/cluster.conf:20: element cman: Relax-NG validity error : Invalid
attribute deadnode_timeout for element cman
/etc/cluster/cluster.conf:20: element cman: Relax-NG validity error :
Expecting element gulm, got cman
/etc/cluster/cluster.conf:2: element cluster: Relax-NG validity error :
Invalid sequence in interleave
/etc/cluster/cluster.conf:2: element cluster: Relax-NG validity error :
Element cluster failed to validate content
/etc/cluster/cluster.conf:15: element device: validity error : IDREF attribute
name references an unknown ID "cluster2-drac"
/etc/cluster/cluster.conf:8: element device: validity error : IDREF attribute
name references an unknown ID "cluster1-drac"
/etc/cluster/cluster.conf fails to validate

cluster.conf:
============

<?xml version="1.0"?>
<cluster config_version="68" name="mycluster">
  <fence_daemon post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="cluster1" votes="1">
      <fence>
        <method name="1">
          <device name="cluster1-drac"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="cluster2" votes="1">
      <fence>
        <method name="1">
          <device name="cluster2-drac"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1" deadnode_timeout="60"/>
  <fencedevices>
    <fencedevice agent="fence_drac" ipaddr="X.73" login="root" name="cluster1-drac" passwd="X"/>
    <fencedevice agent="fence_drac" ipaddr="X.74" login="root" name="cluster2-drac" passwd="X"/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="mysql" ordered="0" restricted="1">
        <failoverdomainnode name="cluster1" priority="1"/>
        <failoverdomainnode name="cluster2" priority="1"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <fs device="/dev/emcpowera1" force_fsck="0" force_unmount="1"
          fsid="6939" fstype="ext3" mountpoint="/disk1" name="/disk1"
          options="" self_fence="1"/>
      <ip address="X.70" monitor_link="1"/>
      <script file="/etc/init.d/mysqld" name="mysqld"/>
      <script file="/etc/init.d/httpd" name="apache"/>
    </resources>
    <service autostart="1" name="all_cluster_services">
      <ip ref="X.70"/>
      <fs ref="/disk1"/>
      <script ref="mysqld"/>
      <script ref="apache"/>
    </service>
  </rm>
</cluster>
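
The two IDREF errors look odd, since both fence device names appear under <fencedevices> in the file above. A minimal sketch for checking that by hand, assuming only the Python standard library. The helper name undefined_fence_devices is hypothetical, and the embedded config is a trimmed copy of the one posted. It does not do Relax-NG validation; it only checks that every <device name="..."> has a matching <fencedevice name="...">, and lists the attributes set on <cman>:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the posted cluster.conf (the <rm> section is omitted).
CONF = """<?xml version="1.0"?>
<cluster config_version="68" name="mycluster">
  <clusternodes>
    <clusternode name="cluster1" votes="1">
      <fence><method name="1"><device name="cluster1-drac"/></method></fence>
    </clusternode>
    <clusternode name="cluster2" votes="1">
      <fence><method name="1"><device name="cluster2-drac"/></method></fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1" deadnode_timeout="60"/>
  <fencedevices>
    <fencedevice agent="fence_drac" ipaddr="X.73" login="root" name="cluster1-drac" passwd="X"/>
    <fencedevice agent="fence_drac" ipaddr="X.74" login="root" name="cluster2-drac" passwd="X"/>
  </fencedevices>
</cluster>
"""

def undefined_fence_devices(root):
    """Return device names referenced by nodes but not defined as fencedevices."""
    defined = {fd.get("name") for fd in root.iter("fencedevice")}
    used = {dev.get("name") for dev in root.iter("device")}
    return sorted(used - defined)

root = ET.fromstring(CONF)
missing = undefined_fence_devices(root)
cman_attrs = sorted(root.find("cman").attrib)

print("undefined fence devices:", missing)
print("attributes on <cman>:", cman_attrs)
```

If the first list comes back empty, the IDREF messages may be a side effect of the earlier cman validation failure rather than a real dangling reference, though that is only a guess.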

-- 
Karl Podesta
Systems Engineer, Securelinx Ltd., Ireland
http://www.securelinx.ie/



