[Cluster-devel] [PATCH 10/19] gfs2: Check for log write errors before telling dlm to unlock
Bob Peterson
rpeterso at redhat.com
Wed Mar 27 12:35:23 UTC 2019
Before this patch, function do_xmote assumed that all writes
submitted to the journal had completed successfully, and it
called the go_unlock function to release the dlm lock. But if
a write failed, and a revoke never made its way to the journal,
a journal replay on another node will cause corruption if we
let the go_inval function continue and tell dlm to release the
glock to another node. This patch adds a couple of
assert_withdraws in do_xmote after the calls to go_sync and
go_inval. The asserts should cause another node to replay the
journal before continuing, thus protecting rgrp and dinode
glocks and maintaining the integrity of the metadata.
Signed-off-by: Bob Peterson <rpeterso at redhat.com>
---
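Note (not part of the commit): the ordering of the checks described
in the commit message can be modelled in userspace roughly as below.
This is only a sketch under stated assumptions. struct model_sb,
struct model_glock, model_assert_withdraw and model_checks_passed
are hypothetical stand-ins, not kernel definitions, and the helper is
assumed to return 0 when the assertion holds and nonzero after
withdrawing, which is how the patch uses gfs2_assert_withdraw.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the gfs2 structures, not the kernel types. */
struct model_sb {
	int log_errors;		/* models sdp->sd_log_errors */
	bool withdrawn;		/* set once an assertion has failed */
};

struct model_glock {
	int ail_count;		/* models gl->gl_ail_count */
};

/* Assumed contract: 0 if the assertion holds, nonzero after withdrawing. */
static int model_assert_withdraw(struct model_sb *sb, bool assertion)
{
	if (assertion)
		return 0;
	sb->withdrawn = true;
	return -1;
}

/*
 * Mirrors the order of the checks added to do_xmote and reports
 * whether every check held, i.e. no withdraw was triggered.
 */
static bool model_checks_passed(struct model_sb *sb, struct model_glock *gl)
{
	/* go_sync would run here; then check for log write errors. */
	model_assert_withdraw(sb, sb->log_errors == 0);

	/* go_inval would run here; then check for log write errors again. */
	if (!model_assert_withdraw(sb, sb->log_errors == 0))
		/* Only if the log is clean, also require an empty AIL count. */
		model_assert_withdraw(sb, gl->ail_count == 0);

	return !sb->withdrawn;
}
```

With log_errors == 0 and ail_count == 0 the model reports success. A
nonzero log_errors or a nonzero ail_count marks the model superblock
as withdrawn, which stands in for the point at which the real code
withdraws so that another node replays the journal.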
fs/gfs2/glock.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 4996ab06e721..72a7b19c3aef 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -566,8 +566,12 @@ __acquires(&gl->gl_lockref.lock)
 	spin_unlock(&gl->gl_lockref.lock);
 	if (glops->go_sync)
 		glops->go_sync(gl);
+	gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_errors) == 0);
 	if (test_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags))
 		glops->go_inval(gl, target == LM_ST_DEFERRED ? 0 : DIO_METADATA);
+
+	if (!gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_errors) == 0))
+		gfs2_assert_withdraw(sdp, !atomic_read(&gl->gl_ail_count));
 	clear_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags);
 
 	gfs2_glock_hold(gl);
--
2.20.1