[Linux-cluster] How to take down a CS/GFS setup with minimum downtime

rhurst at bidmc.harvard.edu
Wed Nov 7 16:43:33 UTC 2007


Lon, "leave remove" works as advertised, but is there a way (i.e., a
parameter) to make the same thing happen automagically when a downed node
re-joins the cluster?  If I down more than one node using the default
"leave remove", it decrements each instance properly and maintains quorum.
But if I start up just one of those nodes later, the expected votes count
jumps all the way back to the value configured in cluster.conf, quorum is
recalculated from that, and in some rare cases I could no longer have
quorum!

Example:

11 nodes (votes are 2 nodes @ 5 each, plus 9 nodes @ 1 each, expected =
19, quorum = 10)
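
(Quorum at each step below looks like it is being recalculated as
floor(expected_votes / 2) + 1, which matches every figure that follows.)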

"leave remove" 1 node @ 5 votes, quorum re-calculates as 8, total = 14
"leave remove" 1 node @ 1 vote, quorum re-calculates as 7, total = 13
"leave remove" 1 node @ 1 vote, quorum re-calculates as 7, total = 12
"leave remove" 1 node @ 1 vote, quorum re-calculates as 6, total = 11
"leave remove" 1 node @ 1 vote, quorum re-calculates as 6, total = 10
"leave remove" 1 node @ 1 vote, quorum re-calculates as 5, total = 9
"leave remove" 1 node @ 1 vote, quorum re-calculates as 5, total = 8

cman join 1 node @ 1 vote, quorum re-calculates as 10, total = 9,
inquorate!!
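
To illustrate what I mean, here is the manual correction I would otherwise
have to make after that rejoin (just a sketch; it assumes cman_tool
expected -e is the right way to push expected votes back down, and it uses
the vote counts from the example above):

  # after the single 1-vote node rejoins: total votes = 9, but expected
  # snaps back to 19, so quorum = 10 and the cluster is inquorate
  cman_tool status

  # on a member node, force expected votes down to what is actually present;
  # quorum then recalculates as floor(9/2) + 1 = 5 and the cluster is
  # quorate again
  cman_tool expected -e 9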

Please advise, thanks.


On Wed, 2007-11-07 at 11:13 +0000, Sævaldur Arnar Gunnarsson wrote:

> Thanks for this, Lon.  I'm down to the last two node members, and according
> to cman_tool status I have two nodes, two votes and a quorum of two.
> --
> Nodes: 2
> Expected_votes: 5
> Total_votes: 2
> Quorum: 2   
> --
> 
> One of those nodes has the GFS filesystems mounted.
> If I issue cman_tool leave remove on the other node, will I run into any
> problems on the node with GFS mounted (for example, due to quorum)?
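> 
> Concretely, the sequence I have in mind is something like this (just a
> sketch; it assumes the node with GFS mounted is the one that stays up):
> 
>   # on the node being retired
>   cman_tool leave remove
> 
>   # on the remaining GFS node, confirm that expected votes and quorum
>   # dropped together and that it is still quorate
>   cman_tool status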
> 
> 
> 
> On Mon, 2007-10-29 at 10:56 -0400, Lon Hohberger wrote:
> 
> > That should do it, yes.  Leave remove is supposed to decrement the
> > quorum count, meaning you can go from 5..1 nodes if done correctly.  You
> > can verify that the expected votes count decreases with each removal
> > using 'cman_tool status'.
> > 
> > 
> > If for some reason the above doesn't work, the alternative looks
> > something like this:
> >   * unmount the GFS volume + stop cluster on all nodes
> >   * use gfs_tool to alter the lock proto to nolock
> >   * mount on node 1.  copy out data.  unmount!
> >   * mount on node 2.  copy out data.  unmount!
> >   * ...
> >   * mount on node 5.  copy out data.  unmount!
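> > 
> > Roughly, in commands (a sketch only; the device and mount point below
> > are placeholders for your own volume):
> > 
> >   # with the cluster stopped and the filesystem unmounted everywhere,
> >   # switch the superblock lock protocol to the single-node module
> >   gfs_tool sb /dev/vg00/gfslv proto lock_nolock
> > 
> >   # then, one node at a time:
> >   mount -t gfs /dev/vg00/gfslv /mnt/gfs
> >   # ... copy the data out ...
> >   umount /mnt/gfs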
> > 
> > -- Lon
> > 