[dm-devel] [PATCH 2/2] multipathd: handle errors in uxlsnr as fatal

Chongyun Wu wu.chongyun at h3c.com
Wed Mar 21 02:43:58 UTC 2018


On 2018/3/21 0:51, Martin Wilck wrote:
> The ppoll() calls of the uxlsnr thread are vital for proper functioning of
> multipathd. If the uxlsnr thread can't open the socket or fails to call ppoll()
> for other reasons, quit the daemon. If we don't do that, multipathd may
> hang in a state where it can't be terminated any more, because the uxlsnr
> thread is responsible for handling all signals. This happens e.g. if
> systemd's multipathd.socket is running and multipathd is started from
> outside systemd.
> 
> 24f2844 "multipathd: fix signal blocking logic" has made this problem more
> severe. Before that patch, the signals weren't actually blocked in any thread.
> That's not to say 24f2844 was wrong. I still think it's correct, we just
> need this one on top.
> 
> Signed-off-by: Martin Wilck <mwilck at suse.com>
> ---
>   multipathd/uxlsnr.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/multipathd/uxlsnr.c b/multipathd/uxlsnr.c
> index cdafd82943e7..6f666663fc6f 100644
> --- a/multipathd/uxlsnr.c
> +++ b/multipathd/uxlsnr.c
> @@ -178,7 +178,7 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, void * trigger_data)
>   
>   	if (ux_sock == -1) {
>   		condlog(1, "could not create uxsock: %d", errno);
> -		return NULL;
> +		exit_daemon();
>   	}
>   
>   	pthread_cleanup_push(uxsock_cleanup, (void *)ux_sock);
> @@ -187,7 +187,7 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, void * trigger_data)
>   	polls = (struct pollfd *)MALLOC((MIN_POLLS + 1) * sizeof(struct pollfd));
>   	if (!polls) {
>   		condlog(0, "uxsock: failed to allocate poll fds");
> -		return NULL;
> +		exit_daemon();
>   	}
>   	sigfillset(&mask);
>   	sigdelset(&mask, SIGINT);
> @@ -249,6 +249,7 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, void * trigger_data)
>   
>   			/* something went badly wrong! */
>   			condlog(0, "uxsock: poll failed with %d", errno);
> +			exit_daemon();
>   			break;
>   		}
Hi Martin,

Your analysis is reasonable. A fatal error here has to be handled rather
than just returned from; otherwise multipathd can't exit normally and
multipathd commands stop working. I think your patch is OK, but it gave
me an idea.

Calling exit_daemon() shuts multipathd down and relies on something
outside to start it again. Could we instead have a function that deals
with the fatal error itself? It would close the socket (if it was created
successfully before) and create a new one so that the uxlsnr thread works
properly again, or keep retrying to create the uxsock. In effect, this
function would try to repair the error.

It's just an idea, and maybe not quite right.
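
To make the idea a bit more concrete, here is a rough sketch of such a
repair function. It is only an illustration, not multipathd code: the names
ux_listen_once() and ux_listen_retry() are hypothetical, and it assumes a
plain filesystem socket path rather than whatever address multipathd
really uses. It tries to create the listening socket a bounded number of
times, closing any half-created socket before each retry, and gives up
with -1 so that the caller can still fall back to exit_daemon().

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of multipathd): create a listening
 * AF_UNIX stream socket bound to the filesystem path `path`.
 * Returns the fd on success, or -1 with errno set on failure.
 */
static int ux_listen_once(const char *path)
{
	struct sockaddr_un addr;
	int fd, saved;

	if (strlen(path) >= sizeof(addr.sun_path)) {
		errno = ENAMETOOLONG;
		return -1;
	}

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd == -1)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, path); /* length checked above */

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
	    listen(fd, 10) == -1) {
		/* close the half-created socket, keep the original errno */
		saved = errno;
		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}

/*
 * Hypothetical repair loop: try up to `max_tries` times, sleeping
 * `delay_sec` seconds between attempts. Returns the fd, or -1 once
 * all attempts have failed (the caller would then exit the daemon).
 */
static int ux_listen_retry(const char *path, int max_tries,
			   unsigned int delay_sec)
{
	int i, fd;

	for (i = 0; i < max_tries; i++) {
		fd = ux_listen_once(path);
		if (fd != -1)
			return fd;
		if (i + 1 < max_tries)
			sleep(delay_sec);
	}
	return -1;
}
```

In uxsock_listen() this would mean calling something like
ux_listen_retry() first and only calling exit_daemon() when it returns -1.
Note that retrying cannot help when another process keeps holding the
address (as in the systemd multipathd.socket case you describe), so
exiting would still be needed as the last resort.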

Regards,
Chongyun





