[dm-devel] [PATCH 2/2] multipath: fix rcu thread cancellation hang

Martin Wilck mwilck at suse.com
Fri Mar 23 21:09:23 UTC 2018


On Fri, 2018-03-23 at 15:00 -0500, Benjamin Marzinski wrote:
> While the rcu code is waiting for a grace period to elapse, no threads
> can register or unregister as rcu reader threads. If for some reason, a
> thread never calls put_multipath_config() to exit a read side critical
> section, then any threads trying to start or stop will hang. This can
> happen if a thread is cancelled between calls to get_multipath_config()
> and put_multipath_config(), and multipathd is reconfigured (which causes
> the rcu code to wait for a grace period).
> 
> This patch fixes this issue in two ways. Where possible, it reorders the
> code or saves config values into local variables to remove cancellation
> points between calls to get_multipath_config() and
> put_multipath_config().  In cases where this isn't possible (or where it
> would cause a significant amount of extra work to be done) multipath now
> pushes a cleanup handler to call put_multipath_config().
> 
> The only functions that were not modified were ones that were only
> called by multipath or mpathpersist, since these are single threaded
> and already disable rcu thread registration.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins at redhat.com>
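
For anyone following along, the two approaches described above can be
sketched roughly as below. This is only an illustration, not code from
the patch: struct config, its checkint field, the read_sections counter
and the two accessor bodies are hypothetical stand-ins, and the real
accessors enter and leave an rcu read side critical section rather than
bumping a counter.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the multipathd configuration. */
struct config { int checkint; };

static struct config cfg = { .checkint = 5 };
static int read_sections;	/* open read side sections, illustration only */

/* Stand-in: the real function enters an rcu read side critical section. */
static struct config *get_multipath_config(void)
{
	read_sections++;
	return &cfg;
}

/* Stand-in: takes void * so it can double as a cleanup handler. */
static void put_multipath_config(void *conf)
{
	(void)conf;
	assert(read_sections > 0);
	read_sections--;
}

/* Approach 1: copy the value out and leave the section early, so that
 * no cancellation point lies between get and put. */
static int wait_interval(void)
{
	struct config *conf = get_multipath_config();
	int checkint = conf->checkint;

	put_multipath_config(conf);
	/* A cancellation point such as sleep(checkint) would go here,
	 * outside the read side section. */
	return checkint;
}

/* Approach 2: keep the section open, but push a cleanup handler so that
 * put_multipath_config() also runs if the thread is cancelled. */
static int log_interval(void)
{
	struct config *conf = get_multipath_config();
	int checkint;

	pthread_cleanup_push(put_multipath_config, conf);
	checkint = conf->checkint;
	/* Code that may hit a cancellation point would go here. */
	pthread_cleanup_pop(1);	/* 1: run the handler on the normal path too */
	return checkint;
}
```

In both cases the read side section is closed again on every path,
which is what keeps a later reconfigure from waiting forever.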

Kudos for doing this meticulous work!

Reviewed-by: Martin Wilck <mwilck at suse.com>

(I admit my review wasn't in depth. I fully ack the idea of the patch, 
and I scanned through it without spotting obvious errors. I did not
check whether you should have changed more code than you already did).

Here's a suggestion, as I think this is getting pretty ugly (not your
fault). Maybe we should rename get_multipath_config() to
__get_multipath_config() and do something like

#define begin_with_config(conf) \
    __get_multipath_config(conf); \
    pthread_cleanup_push(__put_multipath_config, conf); \
    do

#define end_with_config(conf) \
    while(0); \
    pthread_cleanup_pop(1)

... and require that all code blocks accessing the configuration should
be coded like this:

begin_with_config(conf) {
    ... CODE ...
} end_with_config(conf);

IMO that'd improve readability and reduce likelihood of errors.
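
To make the idea a bit more concrete, here is a minimal compilable
sketch. It rests on several assumptions: __get_multipath_config() is
taken to return the config pointer (so the macro assigns it to conf),
__put_multipath_config() is taken to have the void (*)(void *) signature
that pthread_cleanup_push() expects, and struct config, its verbosity
field and the readers counter are hypothetical stand-ins for the real
rcu-protected configuration.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the multipathd configuration. */
struct config { int verbosity; };

static struct config the_config = { .verbosity = 2 };
static int readers;	/* open read side sections, illustration only */

/* Assumed to return the config pointer; the real code would enter an
 * rcu read side critical section here. */
static struct config *__get_multipath_config(void)
{
	readers++;
	return &the_config;
}

/* Assumed cleanup-handler signature; the real code would leave the
 * rcu read side critical section here. */
static void __put_multipath_config(void *arg)
{
	(void)arg;
	assert(readers > 0);
	readers--;
}

#define begin_with_config(conf) \
	(conf) = __get_multipath_config(); \
	pthread_cleanup_push(__put_multipath_config, (conf)); \
	do

#define end_with_config(conf) \
	while (0); \
	pthread_cleanup_pop(1)

/* Example user: the config is only touched inside the block. */
static int read_verbosity(void)
{
	struct config *conf;
	int v = 0;

	begin_with_config(conf) {
		v = conf->verbosity;
	} end_with_config(conf);

	return v;
}
```

One caveat with this scheme: the block must not be left with return or
goto, because pthread_cleanup_push() and pthread_cleanup_pop() have to
be paired in the same lexical scope. A break merely leaves the inner
do/while(0) and still reaches the pop.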

As you're touching so many lines of code anyway, that wouldn't be that
much harder :-/

Regards,
Martin


-- 
Dr. Martin Wilck <mwilck at suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)



