[dm-devel] [PATCH 9/9] dm path selector: Avoid that device removal triggers an infinite loop

Bart Van Assche bart.vanassche at sandisk.com
Thu Sep 1 15:22:30 UTC 2016


On 09/01/2016 08:06 AM, Mike Snitzer wrote:
> On Thu, Sep 01 2016 at 10:14am -0400,
> Bart Van Assche <Bart.VanAssche at sandisk.com> wrote:
>
>> On 08/31/16 20:29, Mike Snitzer wrote:
>>> On Wed, Aug 31 2016 at  6:18pm -0400,
>>> Bart Van Assche <bart.vanassche at sandisk.com> wrote:
>>>
>>>> If pg_init_retries is set and a request is queued against a
>>>> multipath device with all underlying block devices in the "dying"
>>>> state then an infinite loop is triggered because activate_path()
>>>> never succeeds and hence never calls pg_init_done(). Fix this by
>>>> making ql_select_path() skip dying paths.
>>>
>>> Assuming DM multipath needs to be sprinkling these dying queue checks so
>>> deep (which I'm not yet sold on):
>>>
>>> Same would be needed in service-time and round-robin right?
>>
>> Hello Mike,
>>
>> Before addressing service-time and round-robin path selectors I wanted
>> to make sure that we reach agreement about how to fix the queue length
>> path selector.
>>
>> Do you have a proposal for an alternative approach to fix the infinite
>> loop that can be triggered during device removal?
>
> I'm going to look closer now.  But I'd prefer to see the "dying" state
> check(s) elevated to DM multipath.  Really would rather the path
> selectors not have to worry about this state.

Hello Mike,

How about making blk_cleanup_queue() invoke a callback function in dm or
dm-mpath, and using that callback function to keep track of the number
of paths that are not in the "dying" state? That would make it possible
to detect in the dm or dm-mpath driver whether or not all paths are in
the dying state, without having to modify every path selector. This is
just an idea - there might be better alternatives.
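
To make the idea concrete, here is a rough userspace model of the
bookkeeping, not kernel code. Every name in it (path_model, mpath_model,
model_cleanup_queue() and so on) is hypothetical; in particular the
block layer has no such callback hook today, and model_cleanup_queue()
only stands in for the point where blk_cleanup_queue() would mark a
queue as dying.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one underlying path; not a kernel structure. */
struct path_model {
	bool dying;                   /* set once cleanup has started */
	void (*dying_cb)(void *ctx);  /* assumed hook, does not exist upstream */
	void *cb_ctx;
};

/* Hypothetical model of the multipath device's bookkeeping. */
struct mpath_model {
	int nr_live_paths;            /* paths not in the "dying" state */
};

/* Callback the multipath side would register for each path. */
static void mpath_path_dying(void *ctx)
{
	struct mpath_model *m = ctx;

	if (m->nr_live_paths > 0)
		m->nr_live_paths--;
}

/* Register a path and count it as live. */
static void mpath_add_path(struct mpath_model *m, struct path_model *p)
{
	p->dying = false;
	p->dying_cb = mpath_path_dying;
	p->cb_ctx = m;
	m->nr_live_paths++;
}

/* Stand-in for blk_cleanup_queue(): mark dying and notify exactly once. */
static void model_cleanup_queue(struct path_model *p)
{
	if (p->dying)
		return;
	p->dying = true;
	if (p->dying_cb != NULL)
		p->dying_cb(p->cb_ctx);
}

/* The check dm-mpath could do before retrying pg_init. */
static bool mpath_all_paths_dying(const struct mpath_model *m)
{
	return m->nr_live_paths == 0;
}
```

With two paths registered, mpath_all_paths_dying() stays false after the
first model_cleanup_queue() call and becomes true after the second, and
the path selectors never need to look at the queue state. A real
implementation would also need locking or an atomic counter, which this
sketch leaves out.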

Bart.



