[dm-devel] [PATCH v2] multipathd: release uxsocket and resource when cancel thread

Wuchongyun wu.chongyun at h3c.com
Mon Jan 15 09:25:39 UTC 2018


Hi Martin, 
Thank you for your reply! Below is the new patch, updated according to your comments. Please help review it, thanks.

Issue description: when multipathd initializes and calls uxsock_listen to create the unix domain socket, socket creation returns -1 with errno 98 (Address already in use), and uxsock_listen then returns NULL. After multipathd starts up in this state, it can no longer receive any multipathd commands from users, so new multipath creation and any other operation cannot be completed.

We found that the uxlsnr thread's cleanup function neither closes the sockets nor releases the clients when the thread is cancelled; the domain socket is left for the system to release. In some environments, for example when the machine's load is very heavy, the system may not yet have closed the old domain socket when we try to create and bind the new one, so the bind may fail with errno 98 (Address already in use).

We also ran some experiments:
if uxsock_cleanup closes ux_sock first and we then immediately call ux_socket_listen to create a new ux_sock, initialization succeeds; if ux_sock is not closed, ux_socket_listen returns -1 with errno = 98.
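
The experiment can be reproduced outside multipathd with a small sketch like the one below. It assumes a Linux abstract-namespace unix socket; the helper names bind_abstract and rebind_experiment and the socket name passed in are hypothetical and not part of multipath-tools, and the sketch does not claim to mirror ux_socket_listen exactly.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: bind a new stream socket to an abstract-namespace
 * name (Linux only). Returns the fd, or -1 with errno set. */
static int bind_abstract(const char *name)
{
	struct sockaddr_un addr;
	size_t len = strlen(name);
	int fd, saved;

	if (len > sizeof(addr.sun_path) - 1) {
		errno = ENAMETOOLONG;
		return -1;
	}
	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	/* sun_path[0] stays '\0', which selects the abstract namespace. */
	memcpy(addr.sun_path + 1, name, len);

	if (bind(fd, (struct sockaddr *)&addr,
		 offsetof(struct sockaddr_un, sun_path) + 1 + len) < 0) {
		saved = errno;
		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}

/* Hypothetical experiment: returns 0 if a second bind fails with
 * EADDRINUSE while the first socket is open and succeeds once the
 * first socket has been closed; returns -1 otherwise. */
static int rebind_experiment(const char *name)
{
	int first, second;

	first = bind_abstract(name);
	if (first < 0)
		return -1;

	/* Old socket still open: the name is still in use. */
	second = bind_abstract(name);
	if (second >= 0) {
		close(second);
		close(first);
		return -1;
	}
	if (errno != EADDRINUSE) {
		close(first);
		return -1;
	}

	/* Close the old socket first, then bind again immediately. */
	close(first);
	second = bind_abstract(name);
	if (second < 0)
		return -1;
	close(second);
	return 0;
}
```

Calling rebind_experiment() with any otherwise unused name should return 0 on Linux: the name stays occupied for as long as the old descriptor is open, and is free again as soon as that descriptor is closed.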

So we believe that closing the uxsocket and releasing the clients when the thread is cancelled ensures that a newly started multipathd thread can create the new uxsocket successfully and receive multipathd commands properly. This patch also fixes a memory leak of the clients.
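
As an illustration of the approach, here is a minimal standalone sketch of a cancellation cleanup handler that closes a descriptor handed over through the handler argument. The names close_fd_cleanup, listener_thread and cancel_closes_fd are hypothetical, and a pipe stands in for the listening socket. The sketch passes the descriptor through intptr_t, which is a choice made here and not what the patch does; a direct cast between int and void * typically draws a size-mismatch warning on 64-bit builds.

```c
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdint.h>
#include <unistd.h>

/* Cleanup handler: runs when the thread is cancelled. */
static void close_fd_cleanup(void *arg)
{
	int fd = (int)(intptr_t)arg;

	close(fd);
}

/* Stand-in for the listener thread: registers the handler, then blocks. */
static void *listener_thread(void *arg)
{
	int fd = (int)(intptr_t)arg;

	pthread_cleanup_push(close_fd_cleanup, (void *)(intptr_t)fd);
	for (;;)
		pause();	/* pause() is a cancellation point */
	pthread_cleanup_pop(1);	/* not reached, pairs with the push */
	return NULL;
}

/* Returns 1 if the descriptor was closed by the cleanup handler after
 * the thread was cancelled, 0 if it is still open, -1 on setup error. */
static int cancel_closes_fd(void)
{
	pthread_t tid;
	int pipefd[2];
	int closed;

	if (pipe(pipefd) < 0)
		return -1;
	if (pthread_create(&tid, NULL, listener_thread,
			   (void *)(intptr_t)pipefd[0]) != 0) {
		close(pipefd[0]);
		close(pipefd[1]);
		return -1;
	}

	/* Deferred cancellation: acted on at pause(), after the push. */
	pthread_cancel(tid);
	pthread_join(tid, NULL);

	/* A closed descriptor makes fcntl() fail with EBADF. */
	closed = (fcntl(pipefd[0], F_GETFD) == -1 && errno == EBADF);
	close(pipefd[1]);
	return closed;
}
```

Because cancellation is deferred by default, the handler is always registered before the cancel request takes effect at pause(), so the descriptor is closed by the time pthread_join returns.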

Signed-off-by: Chongyun Wu <wu.chongyun at h3c.com>
---
 multipathd/uxlsnr.c |   17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/multipathd/uxlsnr.c b/multipathd/uxlsnr.c
index 98ac25a..ca6bf5b 100644
--- a/multipathd/uxlsnr.c
+++ b/multipathd/uxlsnr.c
@@ -139,6 +139,21 @@ void check_timeout(struct timespec start_time, char *inbuf,
 
 void uxsock_cleanup(void *arg)
 {
+	struct client *client_loop;
+	struct client *client_tmp;
+	int ux_sock = (int)arg;
+
+	pthread_mutex_lock(&client_lock);
+	list_for_each_entry_safe(client_loop, client_tmp, &clients, node) {
+		list_del_init(&client_loop->node);
+		close(client_loop->fd);
+		client_loop->fd = -1;
+		FREE(client_loop);
+	}
+	pthread_mutex_unlock(&client_lock);
+
+	close(ux_sock);
+
 	cli_exit();
 	free_polls();
 }
@@ -162,7 +177,7 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, void * trigger_data)
 		return NULL;
 	}
 
-	pthread_cleanup_push(uxsock_cleanup, NULL);
+	pthread_cleanup_push(uxsock_cleanup, (void *)ux_sock);
 
 	condlog(3, "uxsock: startup listener");
 	polls = (struct pollfd *)MALLOC((MIN_POLLS + 1) * sizeof(struct pollfd));
--
1.7.9.5

Thanks,
Chongyun Wu


-----Original Message-----
From: Martin Wilck [mailto:mwilck at suse.com]
Sent: 2018-1-14 3:50
To: wuchongyun (Cloud) <wu.chongyun at h3c.com>; dm-devel at redhat.com
Cc: guozhonghua (Cloud) <guozhonghua at h3c.com>; gechangwei (Cloud) <ge.changwei at h3c.com>
Subject: Re: [dm-devel] multipath-tools: release uxsocket and resource when cancel thread

On Wed, 2017-11-29 at 09:35 +0000, Wuchongyun wrote:
> Hi ,
> 
> Issue description: 
> when multipathd initializes and calls uxsock_listen to create the unix
> domain socket, socket creation returns -1 with errno 98, and
> uxsock_listen then returns NULL. After multipathd starts up, it can no
> longer receive any multipathd commands to finish new multipath creation!
> 
> We found that the uxlsnr thread's cleanup function neither closes the
> sockets nor releases the clients when the thread is cancelled; the
> domain socket is left for the system to release. In some environments,
> for example when the machine's load is very heavy, the system may not
> yet have closed the old domain socket when we try to create and bind
> the new one, so the bind may return 98 (Address already in use).
> 
> And also we make some experiments:
> In uxsock_cleanup, if we close ux_sock first and then immediately
> call ux_socket_listen to create a new ux_sock, initialization is
> OK; if we don't close ux_sock, ux_socket_listen
> returns -1 with errno = 98.
> 
> So we believe that closing the uxsocket and releasing the clients when
> the thread is cancelled might let a newly started multipathd thread
> create the uxsocket successfully and receive multipathd commands properly.
> This patch can also fix the clients' memory leak.

I think this is correct. But I have some remarks, see below.

> 
> Thanks,
> Chongyun
> 
> Signed-off-by: wu chongyun <wu.chongyun at h3c.com>
> ---
> multipathd/uxlsnr.c |   16 +++++++++++++++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/multipathd/uxlsnr.c b/multipathd/uxlsnr.c index 
> 98ac25a..de6950b 100644
> --- a/multipathd/uxlsnr.c
> +++ b/multipathd/uxlsnr.c
> @@ -139,6 +139,20 @@ void check_timeout(struct timespec start_time, 
> char *inbuf,
> 
>  void uxsock_cleanup(void *arg)
> {
> +       struct client *client_loop;
> +       struct client *client_tmp;
> +       int ux_sock = (int)arg;
> +
> +       /*
> +       * no need to protect clients, 
> +       * because all operations to clients are only from one
> thread(uxlsnr)
> +       */
> +       list_for_each_entry_safe(client_loop, client_tmp, &clients,
> node) {
> +                dead_client(client_loop);

This takes and releases the client_lock in every loop iteration.
I'd rather take the lock, loop over all clients freeing the resources, and release.

> +       }
> +
> +       close(ux_sock);
> +
>        cli_exit();
>        free_polls();
> }
> @@ -162,7 +176,7 @@ void * uxsock_listen(uxsock_trigger_fn 
> uxsock_trigger, void * trigger_data)
>                 return NULL;
>        }
> 
> -        pthread_cleanup_push(uxsock_cleanup, NULL);
> +       pthread_cleanup_push(uxsock_cleanup, (void*)ux_sock);

This patch has whitespace issues.

Regards,
Martin

--
Dr. Martin Wilck <mwilck at suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)




