[Linux-cluster] Clvm over gnbd + rgmanager
Ion Alberdi
ialberdi at histor.fr
Mon Mar 7 14:55:02 UTC 2005
Hi everybody,
I'm now trying to use the cluster logical volume manager (clvm).
debian exports its local disk /dev/hdb over GNBD; buba and gump
both import it as /dev/gnbd/dd:

    debian (/dev/hdb) ---GNBD---> buba (/dev/gnbd/dd)
                      ---GNBD---> gump (/dev/gnbd/dd)
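For reference, a sketch of how that export/import can be set up with the standard gnbd tools (command forms as I remember them from the GNBD docs; the export name "dd" is what the clients see under /dev/gnbd/):

```shell
# On the server node (debian): start the gnbd server daemon and
# export the local disk under the name "dd".
gnbd_serv
gnbd_export -d /dev/hdb -e dd

# On each client node (buba, gump): import all exports from debian;
# the device then appears as /dev/gnbd/dd.
gnbd_import -i debian
```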
On one of the client nodes (buba here) I create a 1 GB logical volume
(after starting the cluster and gnbd):
buba# pvcreate /dev/gnbd/dd
buba# vgcreate vg1 /dev/gnbd/dd
buba# lvcreate -L 1024 -n lv1 vg1
and then run
# vgchange -a y
on all three nodes; now all three nodes have /dev/vg1/lv1.
On one of the nodes I create an ext3 fs:
# mkfs.ext3 -j /dev/vg1/lv1
Then I start rgmanager, which runs a basic script that writes the
name of the node running it to a file on the ext3 fs.
Everything works well until the syslog on the node running the script shows:
Mar 7 15:19:53 gump clurgmgrd[3978]: <notice> status on fs "my fs"
returned 1 (generic error)
/* Here the problem starts: I don't know why status (isMounted in
/usr/share/cluster/fs.sh) returns a failure code... */
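One guess (purely an assumption on my part, not something I have verified in fs.sh): isMounted may be comparing the device string from cluster.conf against what the kernel reports in /proc/mounts, and an LVM volume usually appears there under its device-mapper name, /dev/mapper/vg1-lv1, rather than /dev/vg1/lv1. The comparison that would then fail can be sketched like this:

```shell
# Hypothetical illustration, not the actual fs.sh code: a plain string
# comparison between the configured device and the kernel-reported one.
configured="/dev/vg1/lv1"        # the device= attribute in cluster.conf
reported="/dev/mapper/vg1-lv1"   # what /proc/mounts typically shows for an LV

if [ "$configured" = "$reported" ]; then
    echo "match"
else
    echo "mismatch"   # both names refer to the same LV, yet the check fails
fi
```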
Mar 7 15:19:53 gump clurgmgrd[3978]: <notice> Stopping resource group hello
Mar 7 15:19:55 gump clurgmgrd[3978]: <notice> Resource group hello is
recovering
Mar 7 15:19:55 gump clurgmgrd[3978]: <notice> Recovering failed
resource group hello
Mar 7 15:19:55 gump clurgmgrd[3978]: <notice> start on fs "my fs"
returned 2 (invalid argument(s))
/* Syslog is misleading here, because fs.sh is not OCF-compliant: in
fs.sh, exit 2 does not mean invalid arguments but FAIL. */
Mar 7 15:19:55 gump clurgmgrd[3978]: <warning> #68: Failed to start
hello; return value: 1
Mar 7 15:19:55 gump clurgmgrd[3978]: <notice> Stopping resource group hello
Mar 7 15:19:57 gump clurgmgrd[3978]: <notice> Resource group hello is
recovering
Mar 7 15:19:57 gump clurgmgrd[3978]: <warning> #71: Relocating failed
resource group hello
and on the other node:
Mar 7 15:23:14 buba clurgmgrd[5205]: <notice> start on script "Hello
Script" returned 1 (generic error)
Mar 7 15:23:14 buba clurgmgrd[5205]: <warning> #68: Failed to start
hello; return value: 1
Mar 7 15:23:14 buba clurgmgrd[5205]: <notice> Stopping resource group hello
Mar 7 15:23:16 buba clurgmgrd[5205]: <notice> Resource group hello is
recovering
At this point the fs is also mounted on both nodes at once, which
should never happen...
Is this a bug in clvm or in the fs.sh script?
I tried a simpler prototype with gnbd only (the script mounts
/dev/gnbd/dd directly on the nodes and writes there) and everything
works well, so I think the problem comes from clvm.
Here is my cluster.conf:
<?xml version="1.0"?>
<cluster name="cluster1" config_version="1">
<clusternodes>
<clusternode name="buba" votes="1">
<fence>
<method name="single">
<device name="human" ipaddr="200.0.0.10"/>
</method>
</fence>
</clusternode>
<clusternode name="gump" votes="1">
<fence>
<method name="single">
<device name="human" ipaddr="200.0.0.97"/>
</method>
</fence>
</clusternode>
<clusternode name="debian" votes="1">
<fence>
<method name="single">
<device name="human" ipaddr="200.0.0.102"/>
</method>
</fence>
</clusternode>
</clusternodes>
<fencedevices>
<fencedevice name="human" agent="fence_manual"/>
</fencedevices>
<rm>
<failoverdomains>
<failoverdomain name="hellodomain">
<failoverdomainnode name="gump" priority="1"/>
<failoverdomainnode name="buba" priority="1"/>
</failoverdomain>
</failoverdomains>
<resources>
<fs name="my fs" fstype="ext3" device="/dev/vg1/lv1"
mountpoint="/mnt/gfs"/>
<script name="Hello Script" file="/root/script/hello_v2.sh"/>
</resources>
<resourcegroup name="hello" domain="hellodomain"
recovery="restart|relocate|disable">
<fs ref="my fs">
<script ref="Hello Script"/>
</fs>
</resourcegroup>
</rm>
</cluster>
P.S.: I modified fs.sh as Jiho Hahm suggested on 26.02.05, adding
<child type="fs"/>
<child type="script"/>
so that
<fs ref="my fs">
<script ref="Hello Script"/>
</fs>
is understood, and so that the fs is mounted before the script is
launched and unmounted after the script is stopped.
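For anyone wanting to reproduce that change: the two child declarations go into the rgmanager-specific part of the agent metadata that fs.sh prints. The surrounding markup below is from memory and may differ slightly in your version of fs.sh:

```xml
<special tag="rgmanager">
    <child type="fs"/>
    <child type="script"/>
</special>
```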