[Cluster-devel] [PATCH dlm/next] fs: dlm: avoid ls_waiter_mutex circular lock dependency warning
Alexander Aring
aahringo at redhat.com
Wed Sep 29 17:10:54 UTC 2021
This patch adds comments about a very specific DLM behaviour which is
required to avoid a deadlock. The behaviour concerns a state in which new
DLM operations and the processing of DLM messages are stopped. This DLM
state indicates that a certain group of functions in DLM cannot run, e.g.
the central locking logic stages of fs/dlm/lock.c.
To check on this state a new assert, DLM_ASSERT_OPS_LOCKED(), is
introduced to make sure that the required locks are held wherever this
state is required.
One of the functions where this state is required is
dlm_recover_waiters_pre(), otherwise a deadlock can occur because of a
different lock order between ls_waiters_mutex and res_mutex. A reverse
lock order in dlm_recover_waiters_pre() cannot happen because the right
locks are held to stop all lock operations, as DLM_ASSERT_OPS_LOCKED()
checks. This patch moves the ls_waiters_mutex acquisition in
dlm_recover_waiters_pre() into a different lockclass to avoid a lockdep
warning about a possible deadlock.
The lock should really belong to the same lockclass, but as far as I
understand this is the only way to tell lockdep to handle the lock in
dlm_recover_waiters_pre() differently than the other ls_waiters_mutex
lock calls.
It should avoid the following warning:
[ 619.855891] ======================================================
[ 619.856858] WARNING: possible circular locking dependency detected
[ 619.857865] 5.14.0-1.el9.x86_64+debug #1 Not tainted
[ 619.858646] ------------------------------------------------------
[ 619.859646] dlm_recoverd/3961 is trying to acquire lock:
[ 619.860478] ffff888019dcd628 (&r->res_mutex){+.+.}-{3:3}, at: _receive_unlock_reply+0x78/0x600 [dlm]
[ 619.861999]
[ 619.861999] but task is already holding lock:
[ 619.862933] ffff88800ee901a8 (&ls->ls_waiters_mutex){+.+.}-{3:3}, at: dlm_recover_waiters_pre+0x72/0xc80 [dlm]
[ 619.864529]
[ 619.864529] which lock already depends on the new lock.
[ 619.864529]
[ 619.865837]
[ 619.865837] the existing dependency chain (in reverse order) is:
[ 619.866993]
[ 619.866993] -> #1 (&ls->ls_waiters_mutex){+.+.}-{3:3}:
[ 619.868088] __lock_acquire+0xb72/0x1870
[ 619.868861] lock_acquire+0x1ca/0x570
[ 619.869554] __mutex_lock+0x14c/0x1170
[ 619.870283] add_to_waiters+0x6a/0x500 [dlm]
[ 619.871047] _request_lock+0x39f/0x9f0 [dlm]
[ 619.871860] request_lock.part.0+0x1ae/0x220 [dlm]
[ 619.872713] dlm_user_request+0x237/0x5a0 [dlm]
[ 619.873555] device_user_lock+0x42c/0x660 [dlm]
[ 619.874366] device_write+0x5ff/0x8d0 [dlm]
[ 619.875116] vfs_write+0x1c7/0x850
[ 619.875762] ksys_write+0xf9/0x1d0
[ 619.876385] do_syscall_64+0x3b/0x90
[ 619.877034] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 619.877972]
[ 619.877972] -> #0 (&r->res_mutex){+.+.}-{3:3}:
[ 619.878931] check_prev_add+0x15e/0x20f0
[ 619.879699] validate_chain+0xaba/0xde0
[ 619.880404] __lock_acquire+0xb72/0x1870
[ 619.881100] lock_acquire+0x1ca/0x570
[ 619.881823] __mutex_lock+0x14c/0x1170
[ 619.882506] _receive_unlock_reply+0x78/0x600 [dlm]
[ 619.883365] dlm_recover_waiters_pre+0x6e8/0xc80 [dlm]
[ 619.884262] ls_recover.isra.0+0x517/0x1090 [dlm]
[ 619.885087] dlm_recoverd+0x348/0x430 [dlm]
[ 619.885844] kthread+0x329/0x3e0
[ 619.886456] ret_from_fork+0x22/0x30
[ 619.887113]
[ 619.887113] other info that might help us debug this:
[ 619.887113]
[ 619.888376] Possible unsafe locking scenario:
[ 619.888376]
[ 619.889359] CPU0 CPU1
[ 619.890064] ---- ----
[ 619.890775] lock(&ls->ls_waiters_mutex);
[ 619.891436] lock(&r->res_mutex);
[ 619.892378] lock(&ls->ls_waiters_mutex);
[ 619.893436] lock(&r->res_mutex);
[ 619.893991]
[ 619.893991] *** DEADLOCK ***
[ 619.893991]
[ 619.894930] 3 locks held by dlm_recoverd/3961:
[ 619.895647] #0: ffff88800ee90d78 (&ls->ls_in_recovery){++++}-{3:3}, at: dlm_recoverd+0x1d1/0x430 [dlm]
[ 619.897173] #1: ffff88800ee90c68 (&ls->ls_recoverd_active){+.+.}-{3:3}, at: ls_recover.isra.0+0xf9/0x1090 [dlm]
[ 619.898759] #2: ffff88800ee901a8 (&ls->ls_waiters_mutex){+.+.}-{3:3}, at: dlm_recover_waiters_pre+0x72/0xc80 [dlm]
[ 619.900439]
[ 619.900439] stack backtrace:
[ 619.901145] CPU: 1 PID: 3961 Comm: dlm_recoverd Not tainted 5.14.0-1.el9.x86_64+debug #1
[ 619.902461] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 619.903390] Call Trace:
[ 619.903808] dump_stack_lvl+0x57/0x7d
[ 619.904493] check_noncircular+0x26a/0x310
[ 619.905155] ? print_circular_bug+0x1f0/0x1f0
[ 619.905839] ? alloc_chain_hlocks+0x1de/0x530
[ 619.906528] check_prev_add+0x15e/0x20f0
[ 619.907155] validate_chain+0xaba/0xde0
[ 619.907787] ? check_prev_add+0x20f0/0x20f0
[ 619.908489] __lock_acquire+0xb72/0x1870
[ 619.909147] lock_acquire+0x1ca/0x570
[ 619.909730] ? _receive_unlock_reply+0x78/0x600 [dlm]
[ 619.910554] ? rcu_read_unlock+0x40/0x40
[ 619.911183] ? __lock_acquired+0x1d2/0x8c0
[ 619.911826] ? dlm_recoverd+0x348/0x430 [dlm]
[ 619.912541] __mutex_lock+0x14c/0x1170
[ 619.913160] ? _receive_unlock_reply+0x78/0x600 [dlm]
[ 619.913997] ? _receive_unlock_reply+0x78/0x600 [dlm]
[ 619.914838] ? mutex_lock_io_nested+0xfc0/0xfc0
[ 619.915552] ? dlm_recover_waiters_pre+0x72/0xc80 [dlm]
[ 619.916380] ? io_schedule_timeout+0x150/0x150
[ 619.917072] ? mutex_lock_io_nested+0xfc0/0xfc0
[ 619.917833] ? lockdep_hardirqs_on_prepare.part.0+0x19a/0x350
[ 619.918738] ? _receive_unlock_reply+0x78/0x600 [dlm]
[ 619.919568] _receive_unlock_reply+0x78/0x600 [dlm]
[ 619.920352] dlm_recover_waiters_pre+0x6e8/0xc80 [dlm]
[ 619.921186] ls_recover.isra.0+0x517/0x1090 [dlm]
[ 619.921941] ? dlm_clear_toss+0x280/0x280 [dlm]
[ 619.922666] ? dlm_recoverd+0x33d/0x430 [dlm]
[ 619.923384] dlm_recoverd+0x348/0x430 [dlm]
[ 619.924053] ? ls_recover.isra.0+0x1090/0x1090 [dlm]
[ 619.924896] kthread+0x329/0x3e0
[ 619.925422] ? _raw_spin_unlock_irq+0x24/0x30
[ 619.926100] ? set_kthread_struct+0x100/0x100
[ 619.926788] ret_from_fork+0x22/0x30
Signed-off-by: Alexander Aring <aahringo at redhat.com>
---
fs/dlm/dlm_internal.h | 22 ++++++++++++++++++++++
fs/dlm/lock.c | 12 +++++++++++-
2 files changed, 33 insertions(+), 1 deletion(-)
diff --git a/fs/dlm/dlm_internal.h b/fs/dlm/dlm_internal.h
index 49cf83e04c80..3fb610dfea19 100644
--- a/fs/dlm/dlm_internal.h
+++ b/fs/dlm/dlm_internal.h
@@ -746,6 +746,28 @@ static inline int dlm_no_directory(struct dlm_ls *ls)
return test_bit(LSFL_NODIR, &ls->ls_flags);
}
+#ifdef CONFIG_PROVE_LOCKING
+static inline int dlm_assert_ops_unlocked(struct dlm_ls *ls)
+{
+ int ret = 0;
+
+	/* blocks all new DLM API requests from user space */
+	ret |= !rwsem_is_locked(&ls->ls_in_recovery);
+	/* assume this flag was set while ls_recv_active was locked.
+	 * This makes sure that no receiving is processed and all
+	 * further receives are queued into the requestqueue.
+	 */
+	ret |= !dlm_locking_stopped(ls);
+
+ return ret;
+}
+#define DLM_ASSERT_OPS_LOCKED(ls) \
+	DLM_ASSERT(dlm_assert_ops_unlocked(ls) == 0, \
+		   printk("Locking active in ls: %s\n", ls->ls_name);)
+#else
+#define DLM_ASSERT_OPS_LOCKED(ls)
+#endif /* CONFIG_PROVE_LOCKING */
+
int dlm_netlink_init(void);
void dlm_netlink_exit(void);
void dlm_timeout_warn(struct dlm_lkb *lkb);
diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
index c502c065d007..4231edb3c614 100644
--- a/fs/dlm/lock.c
+++ b/fs/dlm/lock.c
@@ -5125,7 +5125,17 @@ void dlm_recover_waiters_pre(struct dlm_ls *ls)
if (!ms_stub)
return;
- mutex_lock(&ls->ls_waiters_mutex);
+	/* We use mutex_lock_nested() here with a different lockclass to
+	 * avoid a false positive report about a possible circular lock
+	 * dependency. The lockclass should be the same, but this is the
+	 * only way to silence the false positive. The report is about a
+	 * different lock order of ls_waiters_mutex and the lkb's res_mutex
+	 * between here and the lock operations. However, lock operations
+	 * cannot run in parallel with dlm_recover_waiters_pre() because
+	 * certain locks are held, which DLM_ASSERT_OPS_LOCKED(ls) checks.
+	 */
+ DLM_ASSERT_OPS_LOCKED(ls);
+ mutex_lock_nested(&ls->ls_waiters_mutex, 1);
list_for_each_entry_safe(lkb, safe, &ls->ls_waiters, lkb_wait_reply) {
--
2.27.0