[dm-devel] [PATCH 4/5] multipathd: reset __delayed_reconfig from ev_add_map()

Wed Mar 16 20:13:55 UTC 2022

On Mon, Mar 14, 2022 at 10:30:35PM +0100, mwilck at suse.com wrote:
> From: Martin Wilck <mwilck at suse.com>
> 
> With the previous patch, one race condition between child() and
> the uevent handler (ev_add_map()) remains: 1. child() sets
> __delayed_reconfig, 2. ev_add_map() calls schedule_reconfigure() and
> thus DAEMON_CONFIGURE, 3. child() sets DAEMON_IDLE. This would cause
> the pending reconfigure request to be missed.
> 
> To fix this, reset __delayed_reconfig immediately from ev_add_map()
> (and likewise, missing_uev_wait_tick()). This way the wait loop in
> child() terminates on the reconfigure_pending condition.
> 
> Also, these schedule_reconfigure() callers don't really request a
> reconfigure() call; they just unblock processing previously requested
> reconfigure() calls. Add a new helper, unblock_reconfigure(), that
> does just that.
>

Doesn't unblock_reconfigure() allow us to set the state to
DAEMON_CONFIGURE, right after the checkerloop() has set it to
DAEMON_RUNNING. schedule_reconfigure() didn't update the state if it was
DAEMON_RUNNING. Instead, it would wait till checkerloop() set the state
back to DAEMON_IDLE for the reconfigure to start.

I have also come up with a different bug that effects these fixes,
including mine. But it will effect this fix worse. This fix assumes that
when a reconfigure is delayed, it should remain delayed until either a
change event happens on the multipath device (ev_add_map) or multipathd
times out waiting for one (missing_uev_wait_tick). However the map could
be removed before either of those things happen.  It's possible that
multipathd could get a remove uevent after the add event but before the
change event. Either that or a multipathd client command could remove
it, or a dm event could happen either removing the device, or updating
it, but with __setup_multipath() removing it. Those are just the
examples I came up with off the top of my head.

So there's a specific problem where we don't handle clearing
__delayed_reconfig if a map was removed. That can be handled in a
different patch.  But this, or any other oversight we might have here
can be mitigated by being more proactive in removing __delayed_reconfig.
For instance, when then main thread tries to reconfigure, it checks
need_to_delay_reconfig(), and only does the reconfigure if this returns
that it's ok.  __delayed_reconfig mostly exists so that the main thread
doesn't need to grab the vecs lock and loop through the the multipath
devices to know if it need to delay, but need_to_delay_reconfig() is
the definitive answer. When the main thread checks this, we should be
updating __delayed_reconfig to match the result. This patch removes the
"__delayed_reconfig = false;" line, which I think is a mistake.

Imagine if, because a map got removed before ev_add_map() was called,
__delayed_reconfig was true, even though there were no maps with
mpp->wait_for_udev set. Any existing config request would still be
delayed, but if another reconfigure was requested not only would it
happen, but if the "__delayed_reconfig = false;" line was back,
__delayed_reconfig would go back to the correct value.

-Ben

> Suggested-by: Benjamin Marzinski <bmarzins at redhat.com>
> Signed-off-by: Martin Wilck <mwilck at suse.com>
> ---
>  multipathd/main.c | 41 ++++++++++++++++++++---------------------
>  1 file changed, 20 insertions(+), 21 deletions(-)
> 
> diff --git a/multipathd/main.c b/multipathd/main.c
> index d033a9a..2424db7 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -155,16 +155,6 @@ int should_exit(void)
>  	return get_running_state() == DAEMON_SHUTDOWN;
>  }
>  
> -static bool get_delayed_reconfig(void)
> -{
> -	bool val;
> -
> -	pthread_mutex_lock(&config_lock);
> -	val = __delayed_reconfig;
> -	pthread_mutex_unlock(&config_lock);
> -	return val;
> -}
> -
>  /*
>   * global copy of vecs for use in sig handlers
>   */
> @@ -308,6 +298,22 @@ void post_config_state(enum daemon_status state)
>  	pthread_cleanup_pop(1);
>  }
>  
> +static bool unblock_reconfigure(void)
> +{
> +	bool was_delayed;
> +
> +	pthread_mutex_lock(&config_lock);
> +	was_delayed = __delayed_reconfig;
> +	if (was_delayed) {
> +		__delayed_reconfig = false;
> +		__post_config_state(DAEMON_CONFIGURE);
> +	}
> +	pthread_mutex_unlock(&config_lock);
> +	if (was_delayed)
> +		condlog(2, "reconfigure (delayed)");
> +	return was_delayed;
> +}
> +
>  void schedule_reconfigure(enum force_reload_types requested_type)
>  {
>  	pthread_mutex_lock(&config_lock);
> @@ -790,12 +796,9 @@ ev_add_map (char * dev, const char * alias, struct vectors * vecs)
>  		dm_get_info(mpp->alias, &mpp->dmi);
>  		if (mpp->wait_for_udev) {
>  			mpp->wait_for_udev = 0;
> -			if (get_delayed_reconfig() &&
> -			    !need_to_delay_reconfig(vecs)) {
> -				condlog(2, "reconfigure (delayed)");
> -				schedule_reconfigure(FORCE_RELOAD_WEAK);
> +			if (!need_to_delay_reconfig(vecs) &&
> +			    unblock_reconfigure())
>  				return 0;
> -			}
>  		}
>  		/*
>  		 * Not really an error -- we generate our own uevent
> @@ -1899,11 +1902,8 @@ missing_uev_wait_tick(struct vectors *vecs)
>  		}
>  	}
>  
> -	if (timed_out && get_delayed_reconfig() &&
> -	    !need_to_delay_reconfig(vecs)) {
> -		condlog(2, "reconfigure (delayed)");
> -		schedule_reconfigure(FORCE_RELOAD_WEAK);
> -	}
> +	if (timed_out && !need_to_delay_reconfig(vecs))
> +		unblock_reconfigure();
>  }
>  
>  static void
> @@ -3395,7 +3395,6 @@ child (__attribute__((unused)) void *param)
>  			reload_type = reconfigure_pending == FORCE_RELOAD_YES ?
>  				FORCE_RELOAD_YES : FORCE_RELOAD_WEAK;
>  			reconfigure_pending = FORCE_RELOAD_NONE;
> -			__delayed_reconfig = false;
>  			pthread_mutex_unlock(&config_lock);
>  
>  			rc = reconfigure(vecs, reload_type);
> -- 
> 2.35.1