[dm-devel] [PATCH 2/7] multipathd: fix check_path errors with removed map

Benjamin Marzinski bmarzins at redhat.com
Fri Jun 19 16:52:36 UTC 2020


On Fri, Jun 19, 2020 at 01:42:47PM +0000, Martin Wilck wrote:
> On Thu, 2020-06-18 at 18:17 -0500, Benjamin Marzinski wrote:
> > On Thu, Jun 18, 2020 at 07:34:38PM +0000, Martin Wilck wrote:
> > > On Wed, 2020-06-17 at 19:24 -0500, Benjamin Marzinski wrote:
> > > > 
> > > >  static void
> > > > @@ -2088,8 +2084,13 @@ check_path (struct vectors * vecs, struct
> > > > path
> > > > * pp, unsigned int ticks)
> > > >  	 * Synchronize with kernel state
> > > >  	 */
> > > >  	if (update_multipath_strings(pp->mpp, vecs->pathvec, 1)) {
> > > > +		struct dm_info info;
> > > >  		condlog(1, "%s: Could not synchronize with kernel
> > > > state",
> > > >  			pp->dev);
> > > > +		if (pp->mpp && pp->mpp->alias &&
> > > > +		    do_dm_get_info(pp->mpp->alias, &info) == 0)
> > > > +			/* multipath device missing. Likely removed */
> > > > +			return 0;
> > > >  		pp->dmstate = PSTATE_UNDEF;
> > > 
> > > It would be more elegant if we could distinguish different failure
> > > modes from update_multipath_strings() directly, without having to
> > > call
> > > do_dm_get_info() again.
> > 
> > Seems reasonable. I'll take a look at that.
> 
> Yet another idea: I just discussed this with Hannes, and he pointed out
> that right below this code, we have
> 
> 	/* if update_multipath_strings orphaned the path, quit early */
> 	if (!pp->mpp)
> 		return 0;
> 
> ... which could have the same effect, if pp->mpp was reloaded. Probably
> that doesn't happen because the pp->mpp dereference is cached in a
> register, but we could simply add a READ_ONCE there.

When update_multipath_strings() calls update_multipath_table() it will
fail because the table no longer exists.  If we differentiate between
a dm error and not finding the map, we can exit early without having to
call do_dm_get_info() again. But right now, if the map is gone, but we
haven't got the uevent removing it, then nothing will clear pp->mpp. If
we did get the uevent, then it must have grabbed the vecs lock. That
better have caused a memory barrier, which will guarantee that
check_path() sees the updated value.

-Ben
 
> Choose what you prefer.
> 
> Regards,
> Martin
> 
> -- 
> Dr. Martin Wilck <mwilck at suse.com>, Tel. +49 (0)911 74053 2107
> SUSE  Software Solutions Germany GmbH
> HRB 36809, AG Nürnberg GF: Felix
> Imendörffer
> 




More information about the dm-devel mailing list