[dm-devel] [RFC PATCH 0/4] Make prio timeouts work like checkers

Benjamin Marzinski bmarzins at redhat.com
Wed Aug 30 18:44:16 UTC 2023


On Tue, Aug 29, 2023 at 08:34:39PM +0000, Martin Wilck wrote:
> On Fri, 2023-07-28 at 14:05 -0500, Benjamin Marzinski wrote:
> > This patchset changes how prioritizers set their timeouts, to make
> > them
> > match how the checker functions work, and also cleans up some minor
> > timeout issues. I did this to make out timeouts consistent, but if
> > someone has a good reason to object to that, I don't feel
> > strongly that it's necessary, and I can resend just the bugfixes.
> > 
> 
> I don't object the idea, quite the contrary. But I would prefer a
> different solution.
> 
> IMO we should treat the "io_timeout" as a path property, and add a
> field  in "struct path" for it. It would be initialized from 
> conf->checker_timeout, or if that's unset, from sysfs, and finally,
> DEF_TIMEOUT. By reading the sysfs value, we'd be able to accomodate
> different properties for different devices, but we'd not re-read this
> value repeatedly like we're doing now. IMO that would be more
> consistent with what we do for other device properties.
> 
> We currently pass the timeout value down the call stack all the way
> from pathinfo() like this (for alua):
> 
> pathinfo()
>   get_prio()
>      prio_getprio()
>        p->getprio()
>          get_alua_info()
>             get_target_port_group()
>                do_inquiry() (*)
>                  do_inquiry_sg()
>             get_asymmetric_access_state()
>                do_rtpg() (*)
>                
> With the exception of the functions marked by (*), all these functions
> obtain a "struct path" argument, too. IIUC, the main reason we're doing
> this is to avoid stalled getprio() calls for paths that are down
> (bb935d4 ("libmultipath: change failed path prio timeout")).
> 
> IMO it would make more sense to remove the "timeout" arguments from
> these functions, and just determine the timeout where it's needed. I
> don't think that's a layering violation; functions that receive a
> "struct path" can also handle PATH_DOWN. Thus we could write
> 
> int get_prio_timeout_ms(const struct path *pp) 
> {
>       if (pp->state == PATH_DOWN)
>             return 10;
>       else
>             return pp->io_timeout * 1000;
> }
> 
> and use this function as far down the stack as we can.
> 
> Furthermore, to improve code readability and avoid issues like in 3/4
> and 4, I think we should call all variables and fields that take
> millisecond values "timeout_ms".
> 
> Thoughts?

Sure. I can rework this.

-Ben

> 
> Martin


More information about the dm-devel mailing list