[dm-devel] [PATCH v4 16/20] multipath -u : set FIND_MULTIPATHS_WAIT_UNTIL from /dev/shm
Benjamin Marzinski
bmarzins at redhat.com
Thu Apr 12 21:35:03 UTC 2018
On Thu, Apr 12, 2018 at 10:35:02PM +0200, Martin Wilck wrote:
> On Thu, 2018-04-12 at 13:36 -0500, Benjamin Marzinski wrote:
> > On Wed, Apr 04, 2018 at 06:16:23PM +0200, Martin Wilck wrote:
> > > In "find_multipaths smart" mode, use time stamps under
> > > /dev/shm/multipath/find_multipaths to track waiting for multipath
> > > siblings. When a path is first encountered and is "maybe"
> > > multipath, create a
> > > file under /dev/shm, set its modification time to the expiry time
> > > of the
> > > timer, and set the FIND_MULTIPATHS_WAIT_UNTIL variable. On later
> > > calls, also set
> > > FIND_MULTIPATHS_WAIT_UNTIL to the expiry time (but don't change the
> > > time
> > > stamp) if it's not expired yet, or 0 if it is expired. Set
> > > FIND_MULTIPATHS_WAIT_UNTIL even if enough evidence becomes
> > > available to decide
> > > if the path needs to be multipathed - this enables the udev rules
> > > to detect
> > > that this is a device a timer has been started for, and stop it. By
> > > using
> > > /dev/shm, we share information about "smart" timers between initrd
> > > and root
> > > file system, and thus only calculate the timeout once.
> > >
> > > Signed-off-by: Martin Wilck <mwilck at suse.com>
> > > ---
> > > libmultipath/file.c | 2 +-
> > > libmultipath/file.h | 1 +
> > > multipath/main.c | 133
> > > ++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > 3 files changed, 135 insertions(+), 1 deletion(-)
> > >
> > > +
> > > +/**
> > > + * find_multipaths_check_timeout(wwid, tmo)
> > > + * Helper for "find_multipaths smart"
> > > + *
> > > + * @param[in] pp: path to check / record
> > > + * @param[in] tmo: configured timeout for this WWID, or value <= 0
> > > for checking
> > > + * @param[out] until: timestamp until we must wait,
> > > CLOCK_REALTIME, if return
> > > + * value is FIND_MULTIPATHS_WAITING
> > > + * @returns: FIND_MULTIPATHS_WAIT_DONE, if waiting has finished
> > > + * @returns: FIND_MULTIPATHS_ERROR, if internal error occured
> > > + * @returns: FIND_MULTIPATHS_NEVER, if tmo is 0 and we didn't wait
> > > for this
> > > + * device
> > > + * @returns: FIND_MULTIPATHS_WAITING, if timeout hasn't expired
> > > + */
> > > +static int find_multipaths_check_timeout(const struct path *pp,
> > > long tmo,
> > > + struct timespec *until)
> > > +{
> > > + char path[PATH_MAX];
> > > + struct timespec now, ftimes[2], tdiff;
> > > + struct stat st;
> > > + long fd;
> > > + int r, err, retries = 0;
> > > +
> > > + clock_gettime(CLOCK_REALTIME, &now);
> > > +
> >
> > I'm worried about using pp->dev_t here with no method of removing
> > these
> > files. What happens when a path device, say 8:32 is removed and a
> > completely new device comes in reusing the same dev_t?
>
> Hm, what else should we use? devnode name is even worse, and most other
> options are a can of worms... In the worst case, the new device
> wouldn't be waited for (or not long enough), because the marker existed
> before it was detected.
>
> I could simply add a rule that removes the marker in case of a "remove"
> uevent, ok?
Fair enough. Nothing really bad happens, and removing the markers on
remove uevents would solve it.
> >
> > > + if (snprintf(path, sizeof(path), "%s/%s", shm_find_mp_dir,
> > > pp->dev_t)
> > > + >= sizeof(path)) {
> > > + condlog(1, "%s: path name overflow", __func__);
> >
> > shouldn't this be:
> > return FIND_MULTIPATHS_ERROR;
>
> Bah, thank for catching it.
>
> > > static int print_cmd_valid(int k, const vector pathvec,
> > > struct config *conf)
> > > {
> > > static const int vals[] = { 1, 0, 2 };
> > > + int wait = FIND_MULTIPATHS_NEVER;
> > > + struct timespec until;
> > > + struct path *pp;
> > >
> > > if (k < 0 || k >= sizeof(vals))
> > > return 1;
> > >
> > > + if (k == 2) {
> > > + /*
> > > + * Caller ensures that pathvec[0] is the path to
> > > + * examine.
> > > + */
> > > + pp = VECTOR_SLOT(pathvec, 0);
> > > + select_find_multipaths_timeout(conf, pp);
> > > + wait = find_multipaths_check_timeout(
> > > + pp, pp->find_multipaths_timeout, &until);
> > > + if (wait != FIND_MULTIPATHS_WAITING)
> > > + k = 1;
> > > + } else if (pathvec != NULL) {
> > > + pp = VECTOR_SLOT(pathvec, 0);
> > > + wait = find_multipaths_check_timeout(pp, 0,
> > > &until);
> > > + }
> > > + if (wait == FIND_MULTIPATHS_WAITING)
> > > + printf("FIND_MULTIPATHS_WAIT_UNTIL=\"%ld.%06ld\"\n
> > > ",
> > > + until.tv_sec, until.tv_nsec/1000);
> > > + else if (wait == FIND_MULTIPATHS_WAIT_DONE)
> > > + printf("FIND_MULTIPATHS_WAIT_UNTIL=\"0\"\n");
> >
> > If we get an error trying to check the timeout, should we just keep
> > FIND_MULTIPATHS_WAIT_UNTIL the same? Or would it be better to set it
> > to
> > 0, and fail the smart claiming?
>
> The latter is what we do. We set k=1 if find_multipaths_check_timeout()
> returns anything but FIND_MULTIPATHS_WAITING, resulting in
> DM_MULTIPATH_DEVICE_PATH="0".
Oops. I got confused here.
-Ben
> Regards
> Martin
>
> --
> Dr. Martin Wilck <mwilck at suse.com>, Tel. +49 (0)911 74053 2107
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)
More information about the dm-devel
mailing list