[Linux-cluster] I/O to gfs2 hanging or not hanging after heartbeat loss
Jonathan Davies
jonathan.davies at citrix.com
Mon Apr 18 13:12:58 UTC 2016
On 15/04/16 17:14, David Teigland wrote:
>>> However, on some occasions, I observe that node A continues in the loop
>>> believing that it is successfully writing to the file
>
> node A has the exclusive lock, so it continues writing...
>
>>> but, according to
>>> node C, the file stops being updated. (Meanwhile, the file written by
>>> node B continues to be up-to-date as read by C.) This is concerning --
>>> it looks like I/O writes are being completed on node A even though other
>>> nodes in the cluster cannot see the results.
>
> Is node C blocked trying to read the file A is writing? That's what we'd
> expect until recovery has removed node A. Or are C's reads completing
> while A continues writing the file? That would not be correct.
>
>> However, if A happens to own the DLM lock, it does not need
>> to ask DLM's permission because it owns the lock. Therefore, it goes
>> on writing. Meanwhile, the other node can't get DLM's permission to
>> get the lock back, so it hangs.
>
> The description sounds like C might not be hanging in read as we'd expect
> while A continues writing. If that's the case, then it implies that dlm
> recovery has been completed by nodes B and C (removing A), which allows
> the lock to be granted to C for reading. If dlm recovery on B/C has
> completed, it means that A should have been fenced, so A should not be
> able to write once C is given the lock.

Thanks Bob and Dave for your very helpful insights.

Your line of reasoning led me to realise that I am running dlm with
fencing disabled, which explains everything. Node C was not hanging in
read while A continued to write; it kept returning an old value. I
presume that's legitimate: C believes the value it last saw must still
be up to date, because A must have been fenced and so couldn't have
updated it. (It also explains why I didn't see anything useful in the
logs.)

When I run the same test with fencing enabled, A still continues writing
after the failure, but the read on C hangs until A is fenced, at which
point C is able to read the last value A wrote. That's exactly what I
want.
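
For anyone finding this in the archives, the two behaviours can be
summarised with a toy Python model. It is only a sketch of the reasoning
in this thread, not of how dlm or gfs2 work internally. The class, its
methods and the BLOCKED sentinel are all made up for illustration, and
the "C reads once and then trusts its cache" step is an assumption about
why the old value kept coming back.

```python
# Toy model of the reasoning in this thread. NOT how dlm/gfs2 are
# implemented; every name here is hypothetical.
BLOCKED = object()  # sentinel standing in for "the read on C hangs"


class ToyCluster:
    """Models only what happens from the moment A loses heartbeat."""

    def __init__(self, fencing_enabled):
        self.fencing_enabled = fencing_enabled
        self.disk = None         # value actually on shared storage
        self.a_fenced = False    # True once A is cut off from storage
        self.a_removed = False   # True once recovery on B/C has removed A
        self.c_has_lock = False  # True once the lock is granted to C
        self.c_cache = None      # value C read when it got the lock

    def a_write(self, value):
        """A owns the lock, so it writes without asking dlm. Returns success."""
        if self.a_fenced:
            return False
        self.disk = value
        return True

    def lose_heartbeat(self):
        """A drops out of the cluster as seen by B and C."""
        # Assumption: with fencing disabled, recovery does not wait for
        # A to be fenced, so A is removed straight away.
        self.a_removed = not self.fencing_enabled

    def fence_a(self):
        """Fencing cuts A off from storage, which lets recovery finish."""
        self.a_fenced = True
        self.a_removed = True

    def c_read(self):
        """Returns BLOCKED until recovery has removed A."""
        if not self.a_removed:
            return BLOCKED
        if not self.c_has_lock:
            # Lock granted to C: read from disk once, then trust the cache,
            # since nobody else is supposed to be able to write any more.
            self.c_has_lock = True
            self.c_cache = self.disk
        return self.c_cache
```

With ToyCluster(False), a write by A after lose_heartbeat() still
succeeds but c_read() keeps returning the value from before it. With
ToyCluster(True), c_read() returns BLOCKED until fence_a() is called,
after which A's writes fail and C sees the last value A wrote.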

Apologies for the noise, and thanks for the explanations.

Jonathan