[Linux-cluster] How to take down a CS/GFS setup with minimum downtime

Lon Hohberger lhh at redhat.com
Mon Oct 29 14:56:50 UTC 2007


On Fri, 2007-10-26 at 15:29 +0000, Sævaldur Arnar Gunnarsson wrote:
> I've got five RHEL4 systems with CS and about 800 GB of data on a shared
> GFS filesystem.
> I've been tasked to take down the cluster and divide the content of the
> shared GFS filesystem to the local disks on each system with minimum
> downtime.
> 
> I've removed two nodes from the cluster already and am somewhat scared
> of a quorum problem if I remove another node.
> 
> From what I've been able to gather I should use cman_tool leave remove
> on a node once it is ready to leave the cluster and thus be able to
> remove four nodes from a five node cluster without dissolving the quorum
> or risking losing access to the GFS data on the last remaining node.
> 
> Is that correct ?

That should do it, yes.  'cman_tool leave remove' is supposed to lower
the expected votes, and with them the quorum, each time a node leaves,
meaning you can go from 5 nodes down to 1 if done correctly.  You can
verify that the expected votes count decreases with each removal using
'cman_tool status'.
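
To see why the expected votes matter, here is a rough sketch of the
arithmetic.  It assumes one vote per node and that quorum is a simple
majority of the expected votes (expected / 2 + 1, integer division); the
function name quorum_for is made up for illustration and is not part of
any cluster tool.

```shell
#!/bin/sh
# Assumption: one vote per node, quorum = expected_votes / 2 + 1
# (integer division).  quorum_for is a hypothetical helper.
quorum_for() {
    echo $(( $1 / 2 + 1 ))
}

# Walk a five-node cluster down to one node, lowering the expected
# votes by one per departure, as 'cman_tool leave remove' is meant to do.
expected=5
while [ "$expected" -ge 1 ]; do
    echo "expected votes: $expected  quorum: $(quorum_for "$expected")"
    expected=$(( expected - 1 ))
done
```

Under the same assumption, if nodes left without lowering the expected
votes, the count would stay at 5 and the quorum at 3, so two remaining
nodes would not be quorate.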


If for some reason the above doesn't work, the alternative looks
something like this:
  * unmount the GFS volume + stop cluster on all nodes
  * use gfs_tool to alter the lock proto to nolock
  * mount on node 1.  copy out data.  unmount!
  * mount on node 2.  copy out data.  unmount!
  * ...
  * mount on node 5.  copy out data.  unmount!
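
The steps above might be sketched roughly as follows.  The device,
mount point, per-node subdirectory and local target are all hypothetical
placeholders, and the run/DRY_RUN wrapper is only an illustration so the
commands can be reviewed before anything is executed.  With lock_nolock
there is no inter-node locking, so the volume must only ever be mounted
on one node at a time, hence the unmount after every copy.

```shell
#!/bin/sh
# Hypothetical paths; substitute the real shared device, mount point,
# per-node subdirectory and local target directory.
DEVICE=${DEVICE:-/dev/vg_shared/lv_gfs}
MNT=${MNT:-/mnt/gfs}
SRC_SUBDIR=${SRC_SUBDIR:-node1}
DEST=${DEST:-/srv/local}

# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Once, after the volume is unmounted and the cluster is stopped on
# all nodes: change the lock protocol stored in the superblock.
switch_to_nolock() {
    run gfs_tool sb "$DEVICE" proto lock_nolock
}

# On each node in turn, never on two nodes at the same time.
copy_out() {
    run mount -t gfs "$DEVICE" "$MNT" || return 1
    run cp -a "$MNT/$SRC_SUBDIR/." "$DEST/"
    status=$?
    # Always unmount, even if the copy failed, so the next node can mount.
    run umount "$MNT" || return 1
    return "$status"
}
```

Call switch_to_nolock once, then copy_out on each node one after the
other; set DRY_RUN=0 only after the printed commands look right.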

-- Lon



