[Linux-cluster] Using LVM / ext4 in cluster.conf more than one time

Nicolas Ross rossnick-lists at cybercat.ca
Thu Jun 6 16:37:06 UTC 2013


Hi !

We have 2 clusters of 8 nodes each. Since we began using RHCS about 2.5
years ago, we have mostly used GFS on a shared storage array for data
storage. In most cases there is no real need for GFS; ext4 would be
enough, since each filesystem is only used within one service.

Production services are enabled on one cluster at a time. A service
containing a webserver and data directories runs in cluster A, and a
corresponding service in cluster B only uses the data directories. Data
is then synced manually from cluster A to cluster B for recovery in case
of disaster. So each cluster has a copy of both services: both use the
same data filesystem, but only one starts the webserver. A very
simplified version of one cluster.conf is included below to illustrate
this and the problem.

We have begun migrating some GFS filesystems to ext4, as it offers
better performance while retaining the same HA features in the event of
a node failure. While doing so, we discovered this problem.

While an ext4 filesystem on HA-LVM in a clustered environment cannot be
mounted on two nodes at once (that's fine), we do need it to be defined
twice in the cluster.conf file.

So, in my example cluster.conf, you will see two services. Both use the
same lvm and fs resources, but one starts a script and the other
doesn't. Only one service can be active at a time in a given cluster,
and that's fine for us, but only one of them will actually mount the
filesystem.

In my cluster.conf example, I have two services. SandBox is the
"production" service, starting the script and listening on the proper
IPs. SandBoxRecovery is the service I run in my other cluster to
receive the data synced from the "production" service.

In this example, the SandBox service starts fine, but SandBoxRecovery
only starts the IPs and does not mount the FS.

With rg_test, I was able to see this warning:

Warning: Max references exceeded for resource name (type lvm)

and Google led me to:

https://access.redhat.com/site/solutions/222453

which pretty much explains it. You can see the comment on that
solution, which is exactly my question here:

So, how can I define an LVM and FS a second time in cluster.conf?

------------ example cluster.conf ------------
<?xml version="1.0"?>
<cluster config_version="1199" name="CyberClusterAS">
  <cman/>
  <logging debug="off"/>
  <gfs_controld plock_ownership="1" plock_rate_limit="500"/>
  <clusternodes>
    <clusternode name="node201.lan.cybercat.priv" nodeid="1">
      <fence>
        <method name="1">
          <device name="node201-ipmi"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node202.lan.cybercat.priv" nodeid="2">
      <fence>
        <method name="1">
          <device name="node202-ipmi"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node203.lan.cybercat.priv" nodeid="3">
      <fence>
        <method name="1">
          <device name="node203-ipmi"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="192.168.113.151" login="Admin" name="node201-ipmi" passwd="darkman."/>
    <fencedevice agent="fence_ipmilan" ipaddr="192.168.113.152" login="Admin" name="node202-ipmi" passwd="darkman."/>
    <fencedevice agent="fence_ipmilan" ipaddr="192.168.113.153" login="Admin" name="node203-ipmi" passwd="darkman."/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="cybercat" nofailback="1" ordered="0" restricted="0">
        <failoverdomainnode name="node201.lan.cybercat.priv" priority=""/>
        <failoverdomainnode name="node202.lan.cybercat.priv" priority=""/>
        <failoverdomainnode name="node203.lan.cybercat.priv" priority=""/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <lvm lv_name="SandBox" name="VGa-SandBox" vg_name="VGa"/>
      <fs device="/dev/VGa/SandBox" force_fsck="0" force_unmount="1" fsid="64052" fstype="ext4" mountpoint="/CyberCat/SandBox" name="SandBox" options="" self_fence="0"/>
    </resources>
    <service autostart="0" domain="cybercat" exclusive="0" name="SandBox">
      <ip address="192.168.110.43" monitor_link="on" sleeptime="1">
        <ip address="192.168.112.43" monitor_link="on" sleeptime="1"/>
        <lvm ref="VGa-SandBox">
          <fs ref="SandBox">
            <script __independent_subtree="1" file="/CyberCat/SandBox/scripts/startup" name="SandBox-script"/>
          </fs>
        </lvm>
      </ip>
    </service>
    <service autostart="0" domain="cybercat" exclusive="0" name="SandBoxRecovery">
      <ip address="192.168.110.174" monitor_link="on" sleeptime="1">
        <ip address="192.168.112.174" monitor_link="on" sleeptime="1"/>
        <lvm ref="VGa-SandBox">
          <fs ref="SandBox"/>
        </lvm>
      </ip>
    </service>
  </rm>
</cluster>
------------ / example cluster.conf ------------
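
For what it's worth, the naive workaround I would be tempted to try
(completely untested, and the "-recovery" resource names and the second
fsid below are just placeholders I made up) is to declare a second,
separately named lvm/fs pair pointing at the same VG/LV and mountpoint,
and reference that pair from the recovery service only:

------------ hypothetical workaround (untested) ------------
    <resources>
      <lvm lv_name="SandBox" name="VGa-SandBox" vg_name="VGa"/>
      <fs device="/dev/VGa/SandBox" force_fsck="0" force_unmount="1" fsid="64052" fstype="ext4" mountpoint="/CyberCat/SandBox" name="SandBox" options="" self_fence="0"/>
      <!-- hypothetical duplicates for the recovery service: same VG/LV and
           mountpoint, but different resource names and a different fsid -->
      <lvm lv_name="SandBox" name="VGa-SandBox-recovery" vg_name="VGa"/>
      <fs device="/dev/VGa/SandBox" force_fsck="0" force_unmount="1" fsid="64053" fstype="ext4" mountpoint="/CyberCat/SandBox" name="SandBox-recovery" options="" self_fence="0"/>
    </resources>
    ...
    <service autostart="0" domain="cybercat" exclusive="0" name="SandBoxRecovery">
      <ip address="192.168.110.174" monitor_link="on" sleeptime="1">
        <ip address="192.168.112.174" monitor_link="on" sleeptime="1"/>
        <!-- reference the duplicated pair instead of the originals -->
        <lvm ref="VGa-SandBox-recovery">
          <fs ref="SandBox-recovery"/>
        </lvm>
      </ip>
    </service>
------------ / hypothetical workaround ------------

But I have no idea whether rgmanager and the lvm/fs agents are happy
with two resource definitions pointing at the same LV and mountpoint,
so I'd rather hear what the supported way to do this is.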

PS: I also opened a ticket with GSS and am waiting for their answer...



