[Linux-cluster] GFS2 2 Node Cluster - lost Node - Mount not writeable

John Ruemker jruemker at redhat.com
Wed Feb 27 14:34:58 UTC 2008


Thomas Börnert wrote:
> Hi List,
>
> 2 servers - connected with a crossover cable
>
> my rpms:
> gfs2-utils-0.1.38-1.el5
> gfs-utils-0.1.12-1.el5
> kmod-gfs2-1.52-1.16.el5
> cman-2.0.73-1.el5_1.1
>
> my cluster.conf on both nodes
> ---------------------------------------------------------------------------------
> <?xml version="1.0"?>
> <cluster name="cluster" config_version="2">
> <cman two_node="1" expected_votes="1">
> </cman>
> <clusternodes>
>
> <clusternode name="node1" votes="1" nodeid="1">
>          <fence>
>                 <method name="human">
>                         <device name="human" nodename="node1"/>
>                 </method>
>         </fence>
> </clusternode>
>
> <clusternode name="node2" votes="1" nodeid="2">
>          <fence>
>                 <method name="human">
>                         <device name="human" nodename="node2"/>
>                 </method>
>         </fence>
> </clusternode>
> </clusternodes>
>
> <fencedevices>
>         <fencedevice name="human" agent="fence_manual"/>
> </fencedevices>
> </cluster>
> ---------------------------------------------------------------------------------------
> my /etc/hosts on both nodes
> 192.168.0.1	node1
> 192.168.0.2	node2
>
> my filesystem and mount
> mkfs.gfs2 -p lock_dlm -t cluster:drbd -j 2 /dev/drbd0
> mount -t gfs2 -o noatime,nodiratime /dev/drbd0 /test
> (Btw: => drbd works fine as Primary/Primary)
>
> OK, I can use /test on both nodes and can write to files
> and so on.
>
> cman_tool nodes
> --------------------------------------------------------------------------------------
> Node  Sts   Inc   Joined               Name
>    1   M    364   2008-02-26 23:20:16  node1
>    2   M    360   2008-02-26 23:20:16  node2
>
> cman_tool status
> -------------------------------------------------------------------------------------
> Version: 6.0.1
> Config Version: 3
> Cluster Name: cluster
> Cluster Id: 34996
> Cluster Member: Yes
> Cluster Generation: 364
> Membership state: Cluster-Member
> Nodes: 2
> Expected votes: 1
> Total votes: 2
> Quorum: 1  
> Active subsystems: 6
> Flags: 2node 
> Ports Bound: 0  
> Node name: node2
> Node ID: 2
> Multicast addresses: 239.192.136.61 
> Node addresses: 192.168.0.2
>
> NOW: I power node1 off!
>
> my log on node2 shows:
> -----------------------------------------------------------------------------------------
> ==> /var/log/messages <==
> Feb 26 23:27:22 node2 last message repeated 13 times
>
> ==> /var/log/kernel <==
> Feb 26 23:27:31 node2 kernel: tg3: eth1: Link is down.
> Feb 26 23:27:32 node2 kernel: tg3: eth1: Link is up at 100 Mbps, full duplex.
> Feb 26 23:27:32 node2 kernel: tg3: eth1: Flow control is off for TX and off 
> for RX.
> Feb 26 23:27:36 node2 kernel: drbd0: PingAck did not arrive in time.
> Feb 26 23:27:36 node2 kernel: drbd0: peer( Primary -> Unknown ) conn( 
> Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
> Feb 26 23:27:36 node2 kernel: drbd0: Creating new current UUID
> Feb 26 23:27:36 node2 kernel: drbd0: asender terminated
> Feb 26 23:27:36 node2 kernel: drbd0: short read expecting header on sock: 
> r=-512
> Feb 26 23:27:36 node2 kernel: drbd0: tl_clear()
> Feb 26 23:27:36 node2 kernel: drbd0: Connection closed
> Feb 26 23:27:36 node2 kernel: drbd0: Writing meta data super block now.
> Feb 26 23:27:36 node2 kernel: drbd0: conn( NetworkFailure -> Unconnected )
> Feb 26 23:27:36 node2 kernel: drbd0: receiver terminated
> Feb 26 23:27:36 node2 kernel: drbd0: receiver (re)started
> Feb 26 23:27:36 node2 kernel: drbd0: conn( Unconnected -> WFConnection )
>
> ==> /var/log/messages <==
> Feb 26 23:27:37 node2 last message repeated 3 times
> Feb 26 23:27:40 node2 openais[3288]: [TOTEM] The token was lost in the 
> OPERATIONAL state.
> Feb 26 23:27:40 node2 openais[3288]: [TOTEM] Receive multicast socket recv 
> buffer size (288000 bytes).
> Feb 26 23:27:40 node2 openais[3288]: [TOTEM] Transmit multicast socket send 
> buffer size (262142 bytes).
> Feb 26 23:27:40 node2 openais[3288]: [TOTEM] entering GATHER state from 2.
> Feb 26 23:27:42 node2 root: Process did not exit cleanly, returned 2 with 
> signal 0
> Feb 26 23:27:44 node2 openais[3288]: [TOTEM] entering GATHER state from 0.
> Feb 26 23:27:44 node2 openais[3288]: [TOTEM] Creating commit token because I 
> am the rep.
> Feb 26 23:27:44 node2 openais[3288]: [TOTEM] Saving state aru 31 high seq 
> received 31
> Feb 26 23:27:44 node2 openais[3288]: [TOTEM] Storing new sequence id for ring 
> 170
> Feb 26 23:27:44 node2 openais[3288]: [TOTEM] entering COMMIT state.
> Feb 26 23:27:44 node2 openais[3288]: [TOTEM] entering RECOVERY state.
> Feb 26 23:27:44 node2 openais[3288]: [TOTEM] position [0] member 192.168.0.2:
> Feb 26 23:27:44 node2 openais[3288]: [TOTEM] previous ring seq 364 rep 
> 192.168.0.1
> Feb 26 23:27:44 node2 openais[3288]: [TOTEM] aru 31 high delivered 31 received 
> flag 1
> Feb 26 23:27:44 node2 openais[3288]: [TOTEM] Did not need to originate any 
> messages in recovery.
> Feb 26 23:27:44 node2 openais[3288]: [TOTEM] Sending initial ORF token
> Feb 26 23:27:44 node2 openais[3288]: [CLM  ] CLM CONFIGURATION CHANGE
> Feb 26 23:27:44 node2 openais[3288]: [CLM  ] New Configuration:
> Feb 26 23:27:44 node2 fenced[3307]: node1 not a cluster member after 0 sec 
> post_fail_delay
> Feb 26 23:27:44 node2 openais[3288]: [CLM  ]       r(0) ip(192.168.0.2)
> Feb 26 23:27:44 node2 fenced[3307]: fencing node "node1"
>
> ==> /var/log/kernel <==
> Feb 26 23:27:44 node2 kernel: dlm: closing connection to node 1
>
> ==> /var/log/messages <==
> Feb 26 23:27:44 node2 openais[3288]: [CLM  ] Members Left:
> Feb 26 23:27:45 node2 openais[3288]: [CLM  ]       r(0) ip(192.168.0.1)
> Feb 26 23:27:45 node2 fence_manual: Node node1 needs to be reset before 
> recovery can procede.  Waiting for node1 to rejoin the cluster or for manual 
> acknowledgement that it has been reset (i.e. fence_ack_manual -n node1)
>   
Note this message; it is the key to why your touch hangs (more below).

> Feb 26 23:27:45 node2 openais[3288]: [CLM  ] Members Joined:
> Feb 26 23:27:45 node2 openais[3288]: [CLM  ] CLM CONFIGURATION CHANGE
> Feb 26 23:27:45 node2 openais[3288]: [CLM  ] New Configuration:
> Feb 26 23:27:45 node2 openais[3288]: [CLM  ]       r(0) ip(192.168.0.2)
> Feb 26 23:27:45 node2 openais[3288]: [CLM  ] Members Left:
> Feb 26 23:27:45 node2 openais[3288]: [CLM  ] Members Joined:
> Feb 26 23:27:45 node2 openais[3288]: [SYNC ] This node is within the primary 
> component and will provide service.
> Feb 26 23:27:45 node2 openais[3288]: [TOTEM] entering OPERATIONAL state.
> Feb 26 23:27:45 node2 openais[3288]: [CLM  ] got nodejoin message 192.168.0.2
> Feb 26 23:27:45 node2 openais[3288]: [CPG  ] got joinlist message from node 2
> Feb 26 23:27:47 node2 root: Process did not exit cleanly, returned 2 with 
> signal 0
> -------------------------------------------------------------------------------------------------------------
>
> ls /test works
>
> BUT
>
> touch /test/testfile hangs ....
>
> cman_tool nodes shows
> ------------------------------------------------------------------------------------------------------------------
> Node  Sts   Inc   Joined               Name
>    1   X    364                        node1
>    2   M    360   2008-02-26 23:20:16  node2
> -----------------------------------------------------------------------------------------------------------------
>
> cman_tool status shows
> -----------------------------------------------------------------------------------------------------------------
> Version: 6.0.1
> Config Version: 3
> Cluster Name: cluster
> Cluster Id: 34996
> Cluster Member: Yes
> Cluster Generation: 368
> Membership state: Cluster-Member
> Nodes: 1
> Expected votes: 1
> Total votes: 1
> Quorum: 1  
> Active subsystems: 6
> Flags: 2node 
> Ports Bound: 0  
> Node name: node2
> Node ID: 2
> Multicast addresses: 239.192.136.61 
> Node addresses: 192.168.0.2
> ------------------------------------------------------------------------------------------------------------------
>
> My DRBD is no problem; its state is already Primary (StandAlone).
>
> Why can't I write to a GFS2 partition in the "lost node" state?
>
> Now: I power node1 on!
>
> DRBD is no problem -> it's recovered.
> Now I start cman
> and my touch finishes ....
>
> Thanks for any ideas and help
>
> -Thomas
>

This is because you are using manual fencing.  Fencing is required to 
ensure that an errant node does not continue to write to the shared 
filesystem after it has lost communication with the cluster, thereby 
corrupting the data.  The only way to guarantee this is to halt all 
cluster activity (including granting GFS locks) until the fencing 
succeeds.  The "manual" part means that an administrator must intervene 
and correct the problem before cluster operations can resume.  So when 
you power off node1, node2 detects the missed heartbeats and attempts to 
fence node1.  You must now complete that fence manually by powering 
node1 off (already done in your case) and then do one of the following:

     1) Run the following command to acknowledge that you have manually 
fenced the node

               # /sbin/fence_ack_manual -n node1

         OR

     2) Start node1 back up and have it rejoin the cluster
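
For example, the recovery sequence on node2 might look like this (a 
rough sketch only; run the acknowledgement only after you have verified 
that node1 really is powered off):

               # on node2, after confirming node1 is down
               cman_tool nodes                   # node1 should show status "X"
               /sbin/fence_ack_manual -n node1   # acknowledge the manual fence
               touch /test/testfile              # pending GFS2 operations now complete

Once fenced is satisfied, DLM and GFS2 recovery for node1's journal can 
proceed and the hung touch returns.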

The danger with manual fencing comes in when you run fence_ack_manual 
too quickly, without properly investigating the issue or actually 
fencing the node.  You may see that the "fenced" node is still up and 
run the command without noticing that only the network connection 
between the nodes has been lost.  The nodes then proceed with writing to 
GFS without being able to communicate, and they quickly corrupt the data.

So, when using manual fencing, always take caution before running 
fence_ack_manual.
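
If your servers have some form of remote power control (IPMI, iLO, DRAC, 
a network power switch, etc.), replacing fence_manual with a real agent 
avoids this manual step entirely.  A rough sketch only, using 
fence_ipmilan with made-up BMC addresses and credentials (adjust the 
agent and its parameters to whatever hardware you actually have):

         <clusternode name="node1" votes="1" nodeid="1">
                 <fence>
                         <method name="power">
                                 <device name="ipmi-node1"/>
                         </method>
                 </fence>
         </clusternode>
         <!-- same pattern for node2, using device name="ipmi-node2" -->

         <fencedevices>
                 <!-- hypothetical BMC addresses and credentials -->
                 <fencedevice name="ipmi-node1" agent="fence_ipmilan"
                              ipaddr="192.168.0.101" login="admin" passwd="secret"/>
                 <fencedevice name="ipmi-node2" agent="fence_ipmilan"
                              ipaddr="192.168.0.102" login="admin" passwd="secret"/>
         </fencedevices>

Ideally the BMC interfaces are reachable over a path other than the 
crossover link itself, so losing the interconnect does not also take out 
your ability to fence.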

John



