[Linux-cluster] Handling of shutdown and mounting of GFS

Chris Joelly chris-m-lists at joelly.net
Thu Sep 4 22:08:05 UTC 2008


Hello,

I am trying to get RHCS up and running and have had some success so far.

The two-node cluster is running, but I don't know how to remove one
node the correct way. I can move the active service (just an IP address
for now) to the second node, and then I want to remove the other node
from the running cluster. According to the Red Hat documentation,
cman_tool leave remove should be used for this, but when I try it I get
the following error message:

root@store02:/etc/cluster# cman_tool leave remove
cman_tool: Error leaving cluster: Device or resource busy

I cannot figure out which device is busy and keeps the node from
leaving the cluster. The service (the IP address) moved to the other
node correctly, as I can see with clustat ...

The only way I have found out of this situation is to restart the whole
cluster, which brings down the service(s) and results in unnecessary
fencing... Is there a known way to remove a single node without taking
down the whole cluster?
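
My guess is that cman refuses to leave while other cluster subsystems
(rgmanager, GFS, clvmd, fenced) are still registered with it, so a
teardown roughly like the following may be needed on the leaving node
first. The init script names are what I would expect on a Debian-based
install; treat this as a sketch, not a verified procedure:

/etc/init.d/rgmanager stop    # release rgmanager's connection to cman
umount -a -t gfs              # unmount all GFS filesystems on this node
/etc/init.d/gfs-tools stop
/etc/init.d/clvm stop         # clustered LVM daemon, if used
fence_tool leave              # leave the fence domain (may also be handled by the cman script)
cman_tool leave remove        # should now succeed and adjust expected votes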

Another strange thing comes up when I try to use GFS:

I have configured DRBD on top of a hardware RAID 10 device, use LVM2 to
build a cluster-aware VG, and on top of that use LVs with GFS across
the two cluster nodes.

Listing the GFS filesystems in fstab without the noauto option does not
get them mounted at boot by /etc/init.d/gfs-tools. I think this is due
to the order in which the SysV init scripts are started: all the RHCS
scripts are started from rcS, while drbd is started from rc2. I read
the relevant section of the Debian policy to find out whether rcS is
meant to run before rc2, but this is not spelled out there. So I assume
that drbd is started in rc2, after rcS has completed, which would mean
that no filesystem on top of DRBD can be mounted at boot time...
Can anybody confirm this?
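
For what it's worth, the ordering can be checked from the rc
directories, and one workaround I am considering but have not tried
would be to move drbd into rcS (the sequence numbers below are only an
example):

ls /etc/rcS.d/ | grep -E 'cman|gfs'   # the RHCS scripts live here on my install
ls /etc/rc2.d/ | grep -i drbd         # drbd only shows up in the runlevel directories
# possible workaround: start drbd from rcS, before the gfs-tools script
update-rc.d -f drbd remove
update-rc.d drbd start 25 S . stop 08 0 6 .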

The reason I am trying to mount a GFS filesystem at boot time is that I
want to build cluster services on top of it, and more than one of these
services relies on the same filesystem. A better solution would be to
define a shared GFS filesystem resource that can be used by more than
one cluster service, with the cluster making sure the filesystem is
only mounted once...
Can this be achieved with RHCS?
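
What I have in mind would look roughly like the fragment below, based
on my reading of the rgmanager clusterfs resource agent. The device and
mountpoint are placeholders, and I have not tested whether one
clusterfs resource referenced from two services behaves the way I hope:

	<rm>
		<resources>
			<clusterfs name="shared-gfs" fstype="gfs" force_unmount="0"
			           device="/dev/vg_store/lv_data" mountpoint="/srv/data"/>
			<ip address="192.168.2.20" monitor_link="1"/>
			<ip address="192.168.2.23" monitor_link="1"/>
		</resources>
		<service autostart="1" domain="ip-fail2node1" name="store02-ip" recovery="restart">
			<ip ref="192.168.2.23"/>
			<clusterfs ref="shared-gfs"/>
		</service>
		<service autostart="1" domain="ip-fail2node2" name="store01-ip" recovery="restart">
			<ip ref="192.168.2.20"/>
			<clusterfs ref="shared-gfs"/>
		</service>
	</rm>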

Thanks for any advice ...

-- 
Ubuntu 8.04 LTS 64bit
RHCS 2.0
cluster.conf attached!
-------------- next part --------------
<?xml version="1.0"?>
<cluster alias="store" config_version="47" name="store">
	<fence_daemon post_fail_delay="0" post_join_delay="60"/>
	<clusternodes>
		<clusternode name="store01" nodeid="1" votes="1">
			<fence>
				<method name="1">
					<device name="ap7922" port="1"/>
				</method>
			</fence>
		</clusternode>
		<clusternode name="store02" nodeid="2" votes="1">
			<fence>
				<method name="1">
					<device name="ap7922" port="9"/>
				</method>
			</fence>
		</clusternode>
	</clusternodes>
	<cman expected_votes="1" two_node="1"/>
	<fencedevices>
		<fencedevice agent="fence_apc" ipaddr="192.168.2.10" login="***" name="ap7922" passwd="***"/>
	</fencedevices>
	<rm>
		<failoverdomains>
			<failoverdomain name="ip-fail2node2" ordered="1" restricted="0">
				<failoverdomainnode name="store01" priority="1"/>
			</failoverdomain>
			<failoverdomain name="ip-fail2node1" ordered="1">
				<failoverdomainnode name="store02" priority="1"/>
			</failoverdomain>
		</failoverdomains>
		<resources>
			<ip address="192.168.2.20" monitor_link="1"/>
			<ip address="192.168.2.23" monitor_link="1"/>
		</resources>
		<service autostart="1" domain="ip-fail2node1" name="store02-ip" recovery="restart">
			<ip ref="192.168.2.23"/>
		</service>
		<service autostart="1" domain="ip-fail2node2" name="store01-ip" recovery="restart">
			<ip ref="192.168.2.20"/>
		</service>
	</rm>
</cluster>

