[Linux-cluster] Problem with shared storage and volume groups

michael.osullivan at auckland.ac.nz michael.osullivan at auckland.ac.nz
Sat Sep 20 00:11:09 UTC 2008

Hi Chrissie,

Thanks for your help. I have managed to get my cluster up and running and
the volume group is now visible on both cluster nodes. I suspect that
clvmd was not running properly, although I am really not sure. In any case,
after starting up the cluster this afternoon and starting rgmanager and
clvmd on both nodes, the volume group (iscsi_raid_vg) is there. However,
now when I try to create a logical volume to use for my GFS:

lvcreate -L 19.07G iscsi_raid_vg

and I get the error

Error locking on node <other_node>: Error backing up metadata, can't find
VG for group #global
Aborting. Failed to activate new LV to wipe the start of it.

Any suggestions?

Also, I have been trying to configure my 2-node cluster using conga, but
have resorted to command line control as conga seems to hang quite
frequently. I have read the Red Hat Manuals, but there is a lot of
material and nothing that seems to cover the problems I keep encountering.
I think the main issue is that I have shared storage presented via iSCSI
to the cluster. I have to power down every night and power back up every
morning. I then have to use mdadm to reassemble the iSCSI storage (due to
multipathing and RAID). Then I have to get the cluster back up and
running. I am getting better at the process, but I don't think I am
powering down the cluster in a "nice" way, so I get issues when I try to
start it up again. Do you have any suggestions for a shutdown and startup
sequence that will get the cluster up and going nicely? Currently, the
sequence is something like:

# Shutdown
service clvmd stop
service rgmanager stop
service cman stop
# Shut the machine down

# Startup
iptables -F # Isolated experimental network, so I am pretty free here
mdadm --assemble --scan # Sometimes I will have to rebuild the md devices
service cman start
service rgmanager start
service clvmd start
# Do I need to cman_tool join here? Or fence_ack_manual -n <node>?
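
To make the sequence repeatable, it could be wrapped in a small script
along the lines of the sketch below. It only mirrors the current order
listed above (which may well be part of the problem) and stops at the
first step that fails. The names run, cluster_down, cluster_up and the
DRY_RUN variable are made up for illustration; they are not part of cman,
rgmanager or clvmd.

```shell
#!/bin/sh
# Sketch only: wraps the current sequence so that it stops at the first
# failing step. The ordering is copied from the list above, not from any
# recommended procedure.

# Print a command, then execute it unless DRY_RUN=1 is set.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Current shutdown order (the machine is powered off afterwards by hand).
cluster_down() {
    run service clvmd stop &&
    run service rgmanager stop &&
    run service cman stop
}

# Current startup order, including the iSCSI md reassembly.
cluster_up() {
    run iptables -F &&
    run mdadm --assemble --scan &&
    run service cman start &&
    run service rgmanager start &&
    run service clvmd start
}

# Dispatch on the first argument; do nothing if it is absent.
case "${1:-}" in
    down) cluster_down ;;
    up)   cluster_up ;;
esac
```

Saved as, say, cluster-seq.sh (a hypothetical name), running
"DRY_RUN=1 sh cluster-seq.sh up" only prints the steps, which is a cheap
way to check the order before running it for real on each node.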

Sorry for all the questions, but I am only building a prototype at this
stage and would like to get it "down and up" as easily as possible so I
can do some testing.

Kind regards, Mike

