[dm-devel] [PATCH v3 1/2] libmultipath: fix race in stop_io_err_stat_thread

Martin Wilck mwilck at suse.com
Wed Mar 7 14:07:32 UTC 2018


It's wrong, and unnecessary, to call pthread_kill() after
pthread_cancel(). I have observed cases where the io_err_stat
thread hung in libpthread after receiving SIGUSR2, in particular
when multipathd is run under strace. (If multipathd is killed with
SIGINT under strace while the io_err_stat thread is running, it happens
almost every time.) When this happens, the io_err_stat thread
tries to obtain a mutex in the urcu code (presumably in rcu_unregister_thread())
and the main thread hangs in pthread_join(). multipathd can then only be
terminated with kill -KILL.

With this patch, the thread is shut down cleanly, and I have no longer
observed the hang under strace.

Fixes: 95d594fd "multipath-tools: intermittent IO error accounting to improve
reliability"

Signed-off-by: Martin Wilck <mwilck at suse.com>
---
 libmultipath/io_err_stat.c | 1 -
 1 file changed, 1 deletion(-)
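
For illustration only, not part of the patch: a minimal sketch of the
cancel-then-join shutdown that remains after this change. The names worker(),
worker_cleanup() and stop_worker_demo() are hypothetical stand-ins and are not
taken from multipath-tools; the sketch assumes deferred cancellation (the
default) and a worker that regularly reaches a cancellation point.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <unistd.h>

/* Set by the cleanup handler; read by the caller only after pthread_join(). */
static bool cleanup_ran;

/* Hypothetical cleanup handler, standing in for the real thread's teardown. */
static void worker_cleanup(void *arg)
{
	(void)arg;
	cleanup_ran = true;
}

/* Hypothetical worker: loops forever, blocking in a cancellation point. */
static void *worker(void *arg)
{
	(void)arg;
	pthread_cleanup_push(worker_cleanup, NULL);
	for (;;)
		sleep(1);	/* sleep() is a cancellation point */
	pthread_cleanup_pop(1);	/* not reached; must pair with the push */
	return NULL;
}

/*
 * Start the worker, then stop it with pthread_cancel() + pthread_join()
 * only. No signal is sent to the thread.
 * Returns true if the thread ended via cancellation and ran its cleanup.
 */
static bool stop_worker_demo(void)
{
	pthread_t thr;
	void *res = NULL;
	int rc;

	cleanup_ran = false;
	if (pthread_create(&thr, NULL, worker, NULL) != 0)
		return false;

	pthread_cancel(thr);		/* request cancellation */
	rc = pthread_join(thr, &res);	/* wait until cleanup has finished */
	assert(rc == 0);

	return res == PTHREAD_CANCELED && cleanup_ran;
}
```

In this sketch the cancellation request is acted upon the next time the worker
enters sleep(), the cleanup handler runs in the worker's context, and
pthread_join() returns PTHREAD_CANCELED, so no additional signal is needed to
wake the thread.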

diff --git a/libmultipath/io_err_stat.c b/libmultipath/io_err_stat.c
index 00bac9e0e755..536ba87968fd 100644
--- a/libmultipath/io_err_stat.c
+++ b/libmultipath/io_err_stat.c
@@ -749,7 +749,6 @@ destroy_ctx:
 void stop_io_err_stat_thread(void)
 {
 	pthread_cancel(io_err_stat_thr);
-	pthread_kill(io_err_stat_thr, SIGUSR2);
 	pthread_join(io_err_stat_thr, NULL);
 	free_io_err_pathvec(paths);
 	io_destroy(ioctx);
-- 
2.16.1



