[dm-devel] RFC for multipath queue_if_no_path timeout.

Hannes Reinecke hare at suse.de
Fri Sep 27 08:06:14 UTC 2013


On 09/27/2013 08:07 AM, Hannes Reinecke wrote:
> On 09/27/2013 01:49 AM, Mike Snitzer wrote:
>> On Thu, Sep 26 2013 at  7:22pm -0400,
>> Alasdair G Kergon <agk at redhat.com> wrote:
>>
>>> On Thu, Sep 26, 2013 at 10:47:13AM -0700, Frank Mayhar wrote:
>>>> Launching it from ramdisk won't help, particularly, since it still goes
>>>> through the block layer.  The other stuff won't help if a (potentially
>>>> unrelated) bug in the daemon happens to be being tickled at the same
>>>> time, or if some dependency happens to be broken and _that's_ what's
>>>> preventing the daemon from making progress.
>>>  
>>> Then put more effort into debugging your daemon so it doesn't have
>>> bugs that make it die?  Implement the timeout in a robust independent
>>> daemon if it's other code there that's unreliable?
>>>
>>>> And as far as lvm2 and multipath-tools, yeah, they cope okay in the kind
>>>> of environments most people have, but that's not the kind of environment
>>>> (or scale) we have to deal with.
>>>
>>> In what way are your requirements so different that a locked-into-memory
>>> monitoring daemon cannot implement this timeout?
>>
>> Frank, I had a look at your patch.  It leaves a lot to be desired; I was
>> starting to clean it up but ultimately found myself agreeing with
>> Alasdair's original point: that this policy should be implemented in the
>> userspace daemon.
>>
> _Actually_ there is a way this could be implemented properly:
> implement a blk_timeout function.
> 
> The thing is, every request_queue can have a timeout function
> implemented, whose job is to abort requests that have exceeded that
> timeout. E.g. SCSI uses that for the dev_loss_tmo mechanism.
> 
> Multipath, being request-based, could easily implement the same
> mechanism: have a blk_timeout function which would just re-arm the
> timeout in the default case, but abort any queued I/O (after a
> timeout) if all paths are down.
> 
> Hmm. I'll see about drafting up a PoC.
> 
And indeed, here it is.

Completely untested, just to give you an idea of what I was going on
about. Let's see if I can put this to the test somewhere...
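
A rough userspace model of the decision logic described above may help
when reading the proposal. It is not the attached patch and not kernel
code: every name in it (mp_state, mp_request_timed_out and so on) is
hypothetical, and time is modelled as a plain tick counter rather than
jiffies. It only shows the "re-arm by default, abort once all paths
have been down for too long" rule.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical verdicts: either re-arm the timer or give up on the I/O. */
enum mp_timeout_verdict {
	MP_RESET_TIMER,	/* re-arm the timeout, request stays queued */
	MP_ABORT_IO	/* all paths down for too long: fail queued I/O */
};

/* Hypothetical stand-in for the per-device multipath state. */
struct mp_state {
	unsigned int nr_valid_paths;	/* paths currently usable */
	unsigned long no_path_since;	/* tick when the last path went down */
	unsigned long no_path_timeout;	/* ticks to wait; 0 = queue forever */
};

/* Record a path failure; remember when the last usable path vanished. */
void mp_path_down(struct mp_state *m, unsigned long now)
{
	assert(m != NULL);
	if (m->nr_valid_paths > 0 && --m->nr_valid_paths == 0)
		m->no_path_since = now;
}

/* Record a path coming back. */
void mp_path_up(struct mp_state *m)
{
	assert(m != NULL);
	m->nr_valid_paths++;
}

/*
 * Model of the timeout callback: called whenever a queued request's
 * timer fires. In the default case it just re-arms the timer.
 */
enum mp_timeout_verdict mp_request_timed_out(const struct mp_state *m,
					      unsigned long now)
{
	assert(m != NULL);

	/* At least one path is usable: nothing to abort. */
	if (m->nr_valid_paths > 0)
		return MP_RESET_TIMER;

	/* No timeout configured: behave like plain queue_if_no_path. */
	if (m->no_path_timeout == 0)
		return MP_RESET_TIMER;

	/* All paths down, but not for long enough yet. */
	if (now - m->no_path_since < m->no_path_timeout)
		return MP_RESET_TIMER;

	return MP_ABORT_IO;
}
```

With no_path_timeout set to 30 and the last path failing at tick 110, a
timer firing at tick 120 is re-armed, while one firing at tick 140 gets
the queued I/O aborted. A path coming back in between returns the model
to the re-arm case.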

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare at suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dm-mpath-no-path-timeout.patch
Type: text/x-patch
Size: 4589 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/dm-devel/attachments/20130927/6cc1a4e5/attachment.bin>

