[dm-devel] [PATCH v2] libmultipath: is_path_valid(): check if device is in use

Benjamin Marzinski bmarzins at redhat.com
Tue Dec 13 17:50:49 UTC 2022


On Tue, Dec 13, 2022 at 08:46:28AM +0100, Martin Wilck wrote:
> On Tue, 2022-12-06 at 18:37 -0600, Benjamin Marzinski wrote:
> > On Sat, Dec 03, 2022 at 12:43:38AM +0100, mwilck at suse.com wrote:
> > > From: Martin Wilck <mwilck at suse.com>
> > > 
> > > To check whether we will be able to add a given device can be part
> > > of a multipath map, we have two tests in check_path_valid():
> > > released_to_systemd() and the O_EXCL test. The former isn't helpful
> > > if "multipath -u" is called for the first time for a given device,
> > > and the latter is only used in the "find_multipaths smart" case,
> > > because
> > > actively opening the device with O_EXCL, even for a very short
> > > time, is prone
> > > to races with other processes.
> > > 
> > > It turns out that this may cause issues in some scenarios. We saw
> > > problems in
> > > once case where "find_multipaths greedy" was used with a single
> > > non-multipahted root disk and a very large number of multipath
> > > LUNs.
> > > The root disk would first be classified as multipath device.
> > > multipathd
> > > would try to create a map, fail (because the disk was mounted) and
> > > trigger another uevent. But because of the very large number of
> > > multipath
> > > devices, this event was queued up behind thousands of other events,
> > > and
> > > the root device timed out eventually.
> > > 
> > > While a simple workaround for the given problem would be proper
> > > blacklisting
> > > or using a different find_multipaths mode, I am proposing a
> > > different
> > > solution here. An additional test is added in is_path_valid() which
> > > checks whether the given device is currently in use by 1. sysfs
> > > holders,
> > > 2. mounts (from /proc/self/mountinfo) or 3. swaps (from
> > > /proc/swaps). 2.
> > > and 3. are similar to systemd's device detection after switching
> > > root.
> > > This must not only be done for the device itself, but also for all
> > > its
> > > partitions. For mountinfo and swaps, libmount is utilized.
> > > 
> > > With this patch, "multipath -u" will make devices with mounted or
> > > otherwise
> > > used partitions available to systemd early, without waiting for
> > > multipathd
> > > to fail setting up the map and re-triggering an uevent. This should
> > > avoid
> > > the issue described above even without blacklisting. The downside
> > > of it
> > > is a longer runtime of "multipath -u" for almost all devices, in
> > > particular
> > > for real multipath devices. The runtime required for the new checks
> > > was in the
> > > order of 0.1ms-1ms in my tests. Moreover, there is a certain risk
> > > that devices may
> > > wrongly classified as non-multipath because of transient mounts or
> > > holders
> > > created by other processes.
> > > 
> > > To make this code compile on older distributions, we need some
> > > additional
> > > checks in create-config.mk.
> > > 
> > Two small issues below.
> > 
> > > Signed-off-by: Martin Wilck <mwilck at suse.com>
> > > ---
> > >  .github/workflows/build-and-unittest.yaml |   2 +-
> > >  create-config.mk                          |  11 +-
> > >  libmpathutil/libmpathutil.version         |   6 +
> > >  libmpathutil/util.c                       |  12 +
> > >  libmpathutil/util.h                       |   2 +
> > >  libmultipath/Makefile                     |   2 +-
> > >  libmultipath/alias.c                      |  11 -
> > >  libmultipath/valid.c                      | 267
> > > ++++++++++++++++++++++
> > >  tests/Makefile                            |   2 +-
> > >  tests/valid.c                             |  48 ++++
> > >  10 files changed, 348 insertions(+), 15 deletions(-)
> > > 
> > > 
> > > +static int read_partitions(const char *syspath, vector parts)
> > > +{
> > > +       struct scandir_result sr = { .n = 0 };
> > > +       char path[PATH_MAX], *last;
> > > +       char *prop;
> > > +       int i;
> > > +
> > > +       strlcpy(path, syspath, sizeof(path));
> > > +       sr.n = scandir(path, &sr.di, subdir_filter, NULL);
> > > +       if (sr.n == -1)
> > > +               return -errno;
> > > +
> > > +       pthread_cleanup_push_cast(free_scandir_result, &sr);
> > > +
> > > +       /* parts[0] is the whole disk */
> > > +       if (vector_alloc_slot(parts) &&
> > > +           (prop = strdup(strrchr(path, '/') + 1)) != NULL)
> > > +               vector_set_slot(parts, prop);
> > 
> > We should add the whole disk like we do for the partitions below,
> > where
> > we strdup() the name first, and free it if we can't allocate a slot.
> > Otherwise, if the strdup failed, we could leave an empty slot in the
> > vector, which would cause vector_foreach_slot() to stop early
> 
> Sure.
> 
> 
> > >  int
> > >  is_path_valid(const char *name, struct config *conf, struct path
> > > *pp,
> > >               bool check_multipathd)
> > >  {
> > >         int r;
> > >         int fd;
> > > +       const char *prop;
> > >  
> > >         if (!pp || !name || !conf)
> > >                 return PATH_IS_ERROR;
> > > @@ -80,6 +338,10 @@ is_path_valid(const char *name, struct config
> > > *conf, struct path *pp,
> > >         if (!pp->udev)
> > >                 return PATH_IS_ERROR;
> > >  
> > > +       prop = udev_device_get_property_value(pp->udev, "DEVTYPE");
> > > +       if (prop == NULL || strcmp(prop, "disk"))
> > > +               return PATH_IS_NOT_VALID;
> > > +
> > >         r = pathinfo(pp, conf, DI_SYSFS | DI_WWID | DI_BLACKLIST);
> > >         if (r == PATHINFO_SKIPPED)
> > >                 return PATH_IS_NOT_VALID;
> > > @@ -96,6 +358,11 @@ is_path_valid(const char *name, struct config
> > > *conf, struct path *pp,
> > >                 return PATH_IS_ERROR;
> > >         }
> > >  
> > > +       if ((conf->find_multipaths == FIND_MULTIPATHS_GREEDY ||
> > > +            conf->find_multipaths == FIND_MULTIPATHS_SMART) &&
> > > +           is_device_in_use(pp->udev) > 0)
> > > +               return PATH_IS_NOT_VALID;
> > > +
> > 
> > Even if multipath is set to "smart", the common case will still be
> > that
> > the device is in the wwids file, so we know that it's valid. I don't
> > see
> > a reason to do this check in that case.  I was thinking that for the
> > "smart" case, this check should go at the end of the function, before
> > we
> > return PATH_IS_MAYBE_VALID. Or do you have a reason why we should do
> > it
> > here that I'm missing?
> 
> I'd argue the other way around: if the device is in use, it doesn't
> matter any more whether or not it's listed in the WWIDs file. This
> situation can only occur if something went wrong beforehand (most
> probably, an inconsistent WWIDs file in the initrd, or some other
> component misbehaving). If we tag such a device VALID although it's
> mounted, we'll almost certainly fall into emergency mode, regardless of
> the find_multipaths mode we're in. The reason is that multipathd will
> fail to map the mounted device, and systemd will consider the path
> device unusable, leading to the device being inaccessible either way.

Sure. We can leave it like this. I don't have strong opinions one way or
the other.

-Ben
 
> I agree that the device being in use while listed in the WWIDs file is
> an error and should probably be logged more prominently.
> 
> Martin


More information about the dm-devel mailing list