[Cluster-devel] [PATCH dlm/next] fs: dlm: fix possible stuck in shutdown

Alexander Aring aahringo at redhat.com
Thu Aug 26 14:06:31 UTC 2021


When the last lockspace is released, we perform a clean shutdown to
synchronize cluster manager membership with the dlm kernel protocol.
If a passive shutdown (membership removal while the lockspace is
operational) is interrupted by an active shutdown (release of the last
lockspace), this synchronization can get stuck because the active
shutdown waits until the passive shutdown is done. A passive shutdown
waits for the membership remove event. If the active shutdown
interrupts the passive shutdown before that event has happened, the
shutdown gets stuck because dlm recovery (which handles cluster manager
membership events) is already stopped. To avoid this, trigger the event
before calling the active shutdown, so that a pending passive shutdown
can wake up the shutdown waiter.

Reported-by: Chris Mackowski <cmackows at redhat.com>
Signed-off-by: Alexander Aring <aahringo at redhat.com>
---
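Note: the ordering problem described above can be illustrated with a
small user-space model. This is only a sketch, not kernel code: the
struct and the model_* functions are hypothetical names invented for
illustration and do not exist in fs/dlm. It assumes that clearing the
members is what delivers the membership remove event, as the commit
message describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one node's shutdown state (not a dlm struct). */
struct model_node {
	bool passive_shutdown_pending; /* waiting for the remove event */
	bool member_removed;           /* remove event has been seen */
};

/* Models the membership remove event waking the passive waiter. */
static void model_clear_members(struct model_node *n)
{
	n->member_removed = true;
	n->passive_shutdown_pending = false;
}

/*
 * Models the active shutdown. Recovery is assumed to be stopped
 * already, so no remove event can arrive on its own any more.
 * Returns true if the shutdown completes, false if it would wait
 * forever on a pending passive shutdown.
 */
static bool model_active_shutdown(struct model_node *n, bool clear_first)
{
	assert(n);
	if (clear_first)
		model_clear_members(n);
	return !n->passive_shutdown_pending;
}
```

With clear_first set to false the model reports a stuck shutdown
whenever a passive shutdown is still pending; with clear_first set to
true it completes, which mirrors the one-line change in this patch.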
 fs/dlm/lockspace.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c
index 23c2d7308050..10eddfa6c3d7 100644
--- a/fs/dlm/lockspace.c
+++ b/fs/dlm/lockspace.c
@@ -793,6 +793,7 @@ static int release_lockspace(struct dlm_ls *ls, int force)
 
 	if (ls_count == 1) {
 		dlm_scand_stop();
+		dlm_clear_members(ls);
 		dlm_midcomms_shutdown();
 	}
 
-- 
2.27.0



