[Linux-cluster] Clvm over gnbd + rgmanager
Ion Alberdi
ialberdi at histor.fr
Mon Mar 7 14:55:02 UTC 2005
Hi everybody,
I'm now trying to use the cluster logical volume manager (clvm).
debian exports its local disk /dev/hdb over GNBD; buba and gump
both import it as /dev/gnbd/dd:

    debian (/dev/hdb) ---GNBD---> buba (/dev/gnbd/dd)
                      ---GNBD---> gump (/dev/gnbd/dd)
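For reference, a sketch of how that export/import can be set up with the standard gnbd tools (command forms as I remember them from the GNBD docs; the export name "dd" is what the clients see under /dev/gnbd/):

```shell
# On the server node (debian): start the gnbd server daemon and
# export the local disk under the name "dd".
gnbd_serv
gnbd_export -d /dev/hdb -e dd

# On each client node (buba, gump): import all exports from debian;
# the device then appears as /dev/gnbd/dd.
gnbd_import -i debian
```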
On one of the client nodes (buba here) I create a 1 GB logical volume
(after starting the cluster and gnbd):
buba# pvcreate /dev/gnbd/dd
buba# vgcreate vg1 /dev/gnbd/dd
buba# lvcreate -L 1024 -n lv1 vg1
and then run
# vgchange -a y
on all three nodes; now all three nodes have /dev/vg1/lv1.
On one of the nodes I create an ext3 fs:
# mkfs.ext3 -j /dev/vg1/lv1
Then I start rgmanager, which runs a basic script that writes the
name of the node running it to a file on the ext3 fs.
Everything works well until the syslog on the node running the script shows:
Mar 7 15:19:53 gump clurgmgrd[3978]: <notice> status on fs "my fs"
returned 1 (generic error)
/* Here the problem starts: I don't know why status (isMounted in
/usr/share/cluster/fs.sh) returns a failure code... */
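One guess (purely an assumption on my part, not something I have verified in fs.sh): isMounted may be comparing the device string from cluster.conf against what the kernel reports in /proc/mounts, and an LVM volume usually appears there under its device-mapper name, /dev/mapper/vg1-lv1, rather than /dev/vg1/lv1. The comparison that would then fail can be sketched like this:

```shell
# Hypothetical illustration, not the actual fs.sh code: a plain string
# comparison between the configured device and the kernel-reported one.
configured="/dev/vg1/lv1"        # the device= attribute in cluster.conf
reported="/dev/mapper/vg1-lv1"   # what /proc/mounts typically shows for an LV

if [ "$configured" = "$reported" ]; then
    echo "match"
else
    echo "mismatch"   # both names refer to the same LV, yet the check fails
fi
```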
Mar 7 15:19:53 gump clurgmgrd[3978]: <notice> Stopping resource group hello
Mar 7 15:19:55 gump clurgmgrd[3978]: <notice> Resource group hello is
recovering
Mar 7 15:19:55 gump clurgmgrd[3978]: <notice> Recovering failed
resource group hello
Mar 7 15:19:55 gump clurgmgrd[3978]: <notice> start on fs "my fs"
returned 2 (invalid argument(s))
/* Syslog is misleading here, because fs.sh is not OCF-compliant: in
fs.sh, exit 2 does not mean invalid arguments but FAIL. */
Mar 7 15:19:55 gump clurgmgrd[3978]: <warning> #68: Failed to start
hello; return value: 1
Mar 7 15:19:55 gump clurgmgrd[3978]: <notice> Stopping resource group hello
Mar 7 15:19:57 gump clurgmgrd[3978]: <notice> Resource group hello is
recovering
Mar 7 15:19:57 gump clurgmgrd[3978]: <warning> #71: Relocating failed
resource group hello
and on the other node:
Mar 7 15:23:14 buba clurgmgrd[5205]: <notice> start on script "Hello
Script" returned 1 (generic error)
Mar 7 15:23:14 buba clurgmgrd[5205]: <warning> #68: Failed to start
hello; return value: 1
Mar 7 15:23:14 buba clurgmgrd[5205]: <notice> Stopping resource group hello
Mar 7 15:23:16 buba clurgmgrd[5205]: <notice> Resource group hello is
recovering
At this point the fs is also mounted on both nodes at once, which
should never happen...
Is this a bug in clvm or in the fs.sh script?
I tried a simpler prototype with gnbd only (the script mounts
/dev/gnbd/dd directly on the nodes and writes there) and everything
works well, so I think the problem comes from clvm.
Here is my cluster.conf:
<?xml version="1.0"?>
<cluster name="cluster1" config_version="1">
<clusternodes>
<clusternode name="buba" votes="1">
<fence>
<method name="single">
<device name="human" ipaddr="200.0.0.10"/>
</method>
</fence>
</clusternode>
<clusternode name="gump" votes="1">
<fence>
<method name="single">
<device name="human" ipaddr="200.0.0.97"/>
</method>
</fence>
</clusternode>
<clusternode name="debian" votes="1">
<fence>
<method name="single">
<device name="human" ipaddr="200.0.0.102"/>
</method>
</fence>
</clusternode>
</clusternodes>
<fencedevices>
<fencedevice name="human" agent="fence_manual"/>
</fencedevices>
<rm>
<failoverdomains>
<failoverdomain name="hellodomain">
<failoverdomainnode name="gump" priority="1"/>
<failoverdomainnode name="buba" priority="1"/>
</failoverdomain>
</failoverdomains>
<resources>
<fs name="my fs" fstype="ext3" device="/dev/vg1/lv1"
mountpoint="/mnt/gfs"/>
<script name="Hello Script" file="/root/script/hello_v2.sh"/>
</resources>
<resourcegroup name="hello" domain="hellodomain"
recovery="restart|relocate|disable">
<fs ref="my fs">
<script ref="Hello Script"/>
</fs>
</resourcegroup>
</rm>
</cluster>
P.S.: I modified fs.sh as Jiho Hahm suggested on 26.02.05, adding
<child type="fs"/>
<child type="script"/>
so that
<fs ref="my fs">
<script ref="Hello Script"/>
</fs>
is understood, and so that the fs is mounted before the script is
launched and unmounted after the script is stopped.
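For anyone wanting to reproduce that change: the two child declarations go into the rgmanager-specific part of the agent metadata that fs.sh prints. The surrounding markup below is from memory and may differ slightly in your version of fs.sh:

```xml
<special tag="rgmanager">
    <child type="fs"/>
    <child type="script"/>
</special>
```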