[Linux-cluster] GFS2 cluster and fencing

Ray Van Dolson rvandolson at esri.com
Wed Jun 10 20:54:43 UTC 2009


I'm setting up a simple 5-node "cluster", basically just to share a
GFS2 filesystem between the nodes.

I'm not really concerned about HA; I just want all of the nodes to be
able to access the same block device (iSCSI).
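
For reference, I created the filesystem itself roughly along these
lines; the device path and journal count below are just placeholders
for my actual setup:

# Hypothetical iSCSI device path; the lock table name has to match the
# cluster name ("pds"), and -j gives one journal per node.
mkfs.gfs2 -p lock_dlm -t pds:shared -j 5 /dev/sdb

# Mounted the same way on every node:
mount -t gfs2 /dev/sdb /mnt/shared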

<?xml version="1.0"?>
<cluster alias="pds" config_version="6" name="pds">
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="pds27.esri.com" nodeid="1" votes="1">
      <fence>
        <method name="human">
          <device name="human" nodename="pds27.esri.com"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="pds28.esri.com" nodeid="2" votes="1">
      <fence>
        <method name="human">
          <device name="human" nodename="pds28.esri.com"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="pds29.esri.com" nodeid="3" votes="1">
      <fence>
        <method name="human">
          <device name="human" nodename="pds29.esri.com"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="pds30.esri.com" nodeid="4" votes="1">
      <fence>
        <method name="human">
          <device name="human" nodename="pds30.esri.com"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="pds30.esri.com" nodeid="5" votes="1">
      <fence>
        <method name="human">
          <device name="human" nodename="pds30.esri.com"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1"/>
  <fencedevices>
    <fencedevice name="human" agent="fence_manual"/>
  </fencedevices>
</cluster>

As I understand it, this sets up a cluster where only one node needs to
be up to have quorum, and manual fencing is used for each node.
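
In other words, after starting cman on a single node I expect to be
able to confirm quorum with something like:

# start cman (and with it fenced) on this node
service cman start

# should report the cluster as quorate, given expected_votes="1"
cman_tool status

# shows which nodes cman currently sees as members
cman_tool nodes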

However, when I start up the first node in the cluster, the fencing
daemon hangs, complaining that it cannot fence the other nodes.  I have
to run fence_ack_manual -n <nodename> for each of the other nodes, and
only then does everything start up fine.
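
Concretely, startup only proceeds once I acknowledge each absent node
by hand, along these lines (node names as in the config above):

fence_ack_manual -n pds28.esri.com
fence_ack_manual -n pds29.esri.com
fence_ack_manual -n pds30.esri.com
fence_ack_manual -n pds31.esri.com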

Is there a way to make the node simply assume that all the other nodes
are fine and start up?  Am I really running much risk of the GFS2
filesystem being corrupted?
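
The only related knob I've found is fence_daemon's clean_start flag; if
I understand it correctly, something like the following would make
fenced skip its startup fencing of nodes it hasn't seen yet, at the
cost of losing that safety check:

<fence_daemon clean_start="1" post_fail_delay="0" post_join_delay="3"/>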

Thanks,
Ray



