[dm-devel] [PATCH] multipathd: avoid unnecessary path read-only reloads

Benjamin Marzinski bmarzins at redhat.com
Tue Nov 23 15:57:01 UTC 2021


On Tue, Nov 23, 2021 at 11:05:20AM +0000, Martin Wilck wrote:
> Hi Ben,
> 
> some more thoughts about the ro handling.
> 
> On Mon, 2021-11-22 at 20:39 +0100, Martin Wilck wrote:
> > On Mon, 2021-11-22 at 11:43 -0600, Benjamin Marzinski wrote:
> > > On Mon, Nov 22, 2021 at 04:48:06PM +0000, Martin Wilck wrote:
> > > > On Mon, 2021-11-22 at 09:35 -0600, Benjamin Marzinski wrote:
> > > > > On Fri, Nov 19, 2021 at 09:33:39PM +0000, Martin Wilck wrote:
> > > > > > On Thu, 2021-11-18 at 16:47 -0600, Benjamin Marzinski wrote:
> > > > > > > 
> > > > > > > +static bool
> > > > > > > +needs_ro_update(struct multipath *mpp, int ro)
> > > > > > > +{
> > > > > > > +       struct pathgroup * pgp;
> > > > > > > +       struct path * pp;
> > > > > > > +       unsigned int i, j;
> > > > > > > +       struct dm_info *dmi = NULL;
> > > > > > > +
> > > > > > > +       if (!mpp || ro < 0)
> > > > > > > +               return false;
> > > > > > > +       dm_get_info(mpp->alias, &dmi);
> > > > > > 
> > > > > > Why can't you just use mpp->dmi here?
> > > > > 
> > > > > Since that value is set when the dmi is originally created, I
> > > > > didn't
> > > > > want to not reload a map, if we simply haven't updated it yet
> > > > > to
> > > > > reflect
> > > > > a change in the read-only value, like with do with
> > > > > dm_is_suspended()
> > > > > or dm_get_deferred_remove(), etc. I could make a
> > > > > dm_get_read_only()
> > > > > function and put it libmultipath/devmapper.c like the others,
> > > > > if
> > > > > you'd
> > > > > rather.
> > > > 
> > > > I had expected that this property wouldn't silently change under
> > > > us.
> > > > Actually, I do think that we should get an uevent if this
> > > > happens.
> > > > Not sure if we process it properly, though.
> > > 
> > > I think we will update the dmi, 
> 
> This would need to be done on a change uevent for the dm device in
> ev_add_map(), but AFAICS we don't. ev_add_map() is basically a noop if
> the map is already known, unless wait_for_udev is 2.
> 
> > > but I'm not sure that these uevents
> > > won't race. The worry was that the device would switch to read-only
> > > and
> > > then back too quickly, and we would get this event and still see
> > > the
> > > device in read/write because we haven't processed the event which
> > > would
> > > update the multipath dmi.
> > 
> > OK. I'm fine with the patch, perhaps explain these subtleties some
> > more
> > in the commit message for future reference.
> 
> I've never looked into the ro attribute processing closely. I just did.
> I'm unsure how a race would come to pass, in particular with your patch
> applied:
> 
>  1. path change uevent arrives
>  2. ro attribute of path device has changed
>  3. map reload occurs if 
>     a) map was rw before (thus all paths, too) and the path changed to
> ro
>     b) map was ro before and all paths have changed to rw
>  4. kernel will call set_disk_ro() depending on the DM_READONLY_FLAG;
>     set_disk_ro() triggers an uevent for the block device if and only
>     if the ro flag changed
>  5. we set mpp->dmi in __setup_multipath().
> 
> We hold the vecs lock between 3 and 5, so even if the uevent arrived
> before setup_multipath() was called, I don't see how it could race.
> mpp->dmi as derived in 5 should reflect the state correctly.

I admit, I didn't find a definitive race. I was just worried about the
possibility of the dmi being outdated. While there's always the
possibility of the multipath device's RO state getting changed outside
of multipathd (by a multipath call for example), this is not the place
to deal with that. ev_add_map() would be. After looking at this, I'm
o.k. with trusting the existing dmi, especially if we are updating it in
ev_add_map().
 
> What we could do is remember the ro-state of the map in dm_addmap(),
> e.g. in a mpp->ro field. If map creation with ro=0 succeeded, we can be
> pretty certain that the map is in read-write state. Otherwise we'd
> fallback to ro=1, and remember that state, too. We could verify that
> state once more against the dmi info in setup_multipath(). By doing
> that we'd cover the time span between the dm ioctl and retrieving the
> dmi in setup_multipath(). That would IMHO be more consistent than the
> current use of the temporary force_readonly flag.

So the idea would be to never try reloading read-write when the map is
marked as RO, until we get a path event updating the RO state? I do
worry about cases where well fail to reload the map correctly then.
Imagine that we have a map with mpp->ro=1 with one read-only path. The
read-only path gets removed. If we just assume that the mpp->ro state is
correct until with get a path_event changing the read-only state, we
will won't reload read/write here.  The other option would be to check
the path's RO state every time we reload, or at least whenever we're
reloading to remove a path. That has the advantage that it doesn't
produce a dm error message like a failed reload does, but I'm not sure
if it's any less work. Or am I misunderstanding what you are suggesting
here?

> I've been wondering whether we should use your logic during map
> creation, too, and not even try to setup the map with ro=0 if we have
> paths in read-only state.

If sysfs says one of the paths is read-only, it seems reasonable to skip
the read/write reload.

-Ben
 
> Regards
> Martin
> 




More information about the dm-devel mailing list