[Linux-cluster] Removing a node from a running cluster

Graeme Crawford graeme.crawford at gmail.com
Mon Jan 8 18:20:36 UTC 2007


Next time, run "cman_tool leave"; it has a few prerequisites, so check the man page.
Then a "cman_tool expected -e <votes>" should sort out your quorum/votes.
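Something along these lines (a rough sketch; the exact prerequisites and
flags are in the cman_tool man page, so check them against your version):

  cman_tool leave remove        # run on the node being removed, once
                                # rgmanager/gfs/clvmd/fenced are stopped;
                                # "remove" tells the remaining nodes to
                                # recalculate quorum without this node
  cman_tool expected -e <votes> # on a remaining node, if the expected
                                # vote count is still wrong afterwards

After that, pushing the updated cluster.conf with ccs_tool update and
cman_tool version -r (as you already did) should leave things consistent.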

graeme.

On 1/4/07, Pena, Francisco Javier <francisco_javier.pena at roche.com> wrote:
> Hello,
>
> I am seeing some strange cman behavior when removing a node from a running cluster. The starting point is:
>
> - 3 nodes running RHEL 4 U4, GFS 6.1 (1 vote per node)
> - Quorum disk (4 votes)
>
> I stop all cluster services on node 3, then modify the cluster.conf file to remove the node (and adjust the quorum disk votes to 3; a sketch of the change is shown after the command output below), and then run "ccs_tool update" and "cman_tool version -r <new_version>". The cluster services keep running; however, it looks like cman is not completely in sync with ccsd:
>
> # ccs_tool lsnode
>
> Cluster name: TestCluster, config_version: 9
>
> Nodename                        Votes Nodeid Iface Fencetype
> gfsnode1                           1    1          iLO_NODE1
> gfsnode2                           1    2          iLO_NODE2
>
>
> # cman_tool nodes
>
> Node  Votes Exp Sts  Name
>    0    4    0   M   /dev/emcpowera1
>    1    1    3   M   gfsnode1
>    2    1    3   M   gfsnode2
>    3    1    3   X   gfsnode3
>
> # cman_tool status
>
> Protocol version: 5.0.1
> Config version: 9
> Cluster name: TestCluster
> Cluster ID: 62260
> Cluster Member: Yes
> Membership state: Cluster-Member
> Nodes: 2
> Expected_votes: 3
> Total_votes: 6
> Quorum: 4
> Active subsystems: 9
> Node name: gfsnode1
> Node ID: 1
> Node addresses: A.B.C.D
>
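> For reference, the cluster.conf change behind config version 9 looks roughly like this (a simplified sketch of my file; the remaining quorumd and fencing parameters are left out):
>
>   <cluster name="TestCluster" config_version="9">
>     <quorumd votes="3" device="/dev/emcpowera1"/>   <!-- was votes="4" -->
>     <clusternodes>
>       <clusternode name="gfsnode1" votes="1"/>
>       <clusternode name="gfsnode2" votes="1"/>
>       <!-- the gfsnode3 entry was removed here -->
>     </clusternodes>
>     ...
>   </cluster>
>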
> CMAN still lists the third node as a cluster member that has simply gone down (status X above), instead of dropping it from the node list. In addition, it is not updating the number of votes for the quorum disk. If I completely restart the cluster services on all nodes (roughly as sketched after the list below), I get the right information:
>
> - Correct votes for the quorum disk
> - The third node disappears
> - The Expected_votes value is now 2
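>
> By "completely restart" I mean stopping the usual RHEL4 cluster init scripts on every node and starting them again, roughly in this order (clvmd/rgmanager only where they are actually used):
>
>   service rgmanager stop; service gfs stop; service clvmd stop
>   service fenced stop; service qdiskd stop; service cman stop; service ccsd stop
>   # ...then the same scripts with "start", in reverse order, on all nodes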
>
> I know from a previous post that two-node clusters are a special case, even with a quorum disk, but I am pretty sure the same problem would happen with higher node counts (I just do not have enough hardware to test it).
>
> So, is this considered a bug, or is it expected that the information from removed nodes stays around until the whole cluster is restarted?
>
> Thanks in advance,
>
> Javier Peña
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



