[Linux-cluster] I/O to gfs2 hanging or not hanging after heartbeat loss
jonathan.davies at citrix.com
Fri Apr 15 14:55:02 UTC 2016
I have made some observations about the behaviour of gfs2 and would
appreciate confirmation of whether this is expected behaviour or
something has gone wrong.
I have a three-node cluster -- let's call the nodes A, B and C. On each
of nodes A and B, I have a loop that repeatedly writes an increasing
integer value to a file in the GFS2-mountpoint. On node C, I have a loop
that reads from both these files from the GFS2-mountpoint. The reads on
node C show the latest values written by A and B, and stay up-to-date.
All good so far.
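Roughly, the writer loop on A (and B) does something like the sketch below -- this is not the exact program, and the path, block size and one-second interval are just placeholders (as noted further down, the fd is opened O_DIRECT|O_SYNC):

/* Sketch of the writer loop: repeatedly overwrite a file on the GFS2
 * mountpoint with an increasing counter, using direct, synchronous I/O. */
#define _GNU_SOURCE            /* for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const size_t blksz = 4096;   /* O_DIRECT needs aligned buffer/length */
    char *buf;
    if (posix_memalign((void **)&buf, blksz, blksz) != 0) {
        perror("posix_memalign");
        return 1;
    }

    /* placeholder path on the GFS2 mountpoint */
    int fd = open("/mnt/gfs2/counter-A",
                  O_WRONLY | O_CREAT | O_DIRECT | O_SYNC, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    for (unsigned long i = 0; ; i++) {
        memset(buf, 0, blksz);
        snprintf(buf, blksz, "%lu\n", i);

        /* Overwrite the start of the file; this is where I would
         * expect node A to block after losing the heartbeat. */
        if (pwrite(fd, buf, blksz, 0) != (ssize_t)blksz) {
            perror("pwrite");
            return 1;
        }
        sleep(1);
    }
}

Node C just re-reads both files in a loop and prints the latest value it sees.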
I then cause node A to drop the corosync heartbeat by executing the
following on node A:
iptables -I INPUT -p udp --dport 5404 -j DROP
iptables -I INPUT -p udp --dport 5405 -j DROP
iptables -I INPUT -p tcp --dport 21064 -j DROP
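(5404 and 5405 are the default corosync totem UDP ports; 21064 is the default TCP port used by the DLM.)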
After a few seconds, I normally observe that all I/O to the GFS2
filesystem hangs forever on node A: the latest value read by node C is
the same as the last successful write by node A. This is exactly the
behaviour I want -- I need to be sure that node A never completes a
write whose result the other nodes cannot see.
However, on some occasions, I observe that node A continues in the loop
believing that it is successfully writing to the file but, according to
node C, the file stops being updated. (Meanwhile, the file written by
node B continues to be up-to-date as read by C.) This is concerning --
it looks like I/O writes are being completed on node A even though other
nodes in the cluster cannot see the results.
I performed this test 20 times, rebooting node A between runs, and saw
the "I/O hanging" behaviour 16 times and the "I/O appears to continue"
behaviour 4 times. I couldn't identify anything that determines which
behaviour I get on a given run.
So... is this expected? Should I be able to rely upon I/O hanging? Or
have I misconfigured something? Advice would be appreciated.
* The I/O from node A uses an fd that is O_DIRECT|O_SYNC, so the page
cache is not involved.
* Versions: corosync 2.3.4, dlm_controld 4.0.2, gfs2 as per RHEL 7.2.
* I don't see anything particularly useful being logged. Soon after I
insert the iptables rules, I see the following on node A:
2016-04-15T14:15:45.608175+00:00 localhost corosync: [TOTEM ] The
token was lost in the OPERATIONAL state.
2016-04-15T14:15:45.608191+00:00 localhost corosync: [TOTEM ] A
processor failed, forming new configuration.
2016-04-15T14:15:45.608198+00:00 localhost corosync: [TOTEM ]
entering GATHER state from 2(The token was lost in the OPERATIONAL state.).
Around the time node C sees the output from node A stop changing, node A logs:
2016-04-15T14:15:58.388404+00:00 localhost corosync: [TOTEM ]
entering GATHER state from 0(consensus timeout).