[dm-devel] Questions around multipath failover and no_path_retry

Martin Wilck mwilck at suse.com
Mon Mar 19 22:48:43 UTC 2018


On Sun, 2018-03-11 at 16:47 +0000, Karan Vohra wrote:
> Hi Folks,
> 
> Let us assume, there are 2 paths within the path group which dm-
> multipath is sending the I/Os in round-robin fashion. Each of these
> paths are identified as unique block device(s) such as /dev/sdb and
> /dev/sdc. 
> 
> Let us say some I/Os are sent over to the path /dev/sdb and either
> the requests time out or there is a failure on that path, what
> happens to those I/Os? Are they sent over to the other path -
> /dev/sdc or does dm-multipath waits for /dev/sdb to come back online
> and only sends I/O to /dev/sdb?

The I/O is sent to other paths (sdc in your example) when the lower
layer (e.g. SCSI) indicates path failure for sdb. That's the very point
of multipathing.

>  One of the reasons we are concerned about the above scenario is- let
> us say there is a write I/O W1 which is routed to /dev/sdb and then
> there is a failure. There was a write I/O W2 which wrote at the same
> block via /dev/sdc. Now if multipath sends W1 through /dev/sdc, W2
> gets overwritten by W1. The expectation was that W2 happens after W1
> and should overwrite W1 but the result is opposite. 

If you send two write IOs to the same sector at the same time, you
can't be sure which one arrives first. That's not specific to
multipath. If you want to guarantee ordering, you have to flush W1 
using e.g. fdatasync() before sending W2. The flush command won't
return before W1 is written to disk.
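As a sketch of that ordering guarantee (an ordinary temporary file stands in
for the block device here, and all names are purely illustrative, not part of
any multipath API):

```python
import os
import tempfile

# Hypothetical illustration: enforce W1-before-W2 ordering with fdatasync().
# A temp file stands in for the shared sector on the multipath device.
fd, path = tempfile.mkstemp()
os.pwrite(fd, b"W1", 0)   # W1: write at offset 0
os.fdatasync(fd)          # does not return until W1 is on stable storage
os.pwrite(fd, b"W2", 0)   # W2: issued only after W1 is durable
os.fdatasync(fd)
os.close(fd)

with open(path, "rb") as f:
    final = f.read(2)     # the sector now holds W2, never a stale W1
os.remove(path)
```

Without the intermediate fdatasync() the kernel is free to complete the two
writes in either order, multipath or not.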

> Situations like these can cause data inconsistency and corruption. We
> were thinking of using no_path_retry configuration to be set to queue
> to make sure that the I/Os supposed to be going to path1 never make
> it to path2.

That won't work. As the name of the option suggests, "no_path_retry"
only affects the behavior if there's _no_ healthy path left.
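For reference, this is where that option lives. A hedged /etc/multipath.conf
fragment (the documented values of no_path_retry, shown in the defaults
section for illustration; it can also be set per-device):

```
defaults {
        # Takes effect only once ALL paths in the map have failed:
        #   queue  - queue I/O indefinitely until a path returns
        #   fail   - fail I/O back to the caller immediately
        #   <n>    - retry for n polling intervals, then fail
        no_path_retry queue
}
```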

>  But the question is that would not that cause unexpected behavior in
> application layer? Let us say there are I/O Requests R1, R2, R3 and
> so on.. R1 is going to Path1, R2 is going to Path2 and so on. If
> Path1 dies for some reason, with the setting of no_path_retry to
> queue, queueing will not stop until the path is fixed so does not
> that mean that R1, R3,R5 ... will not make it to block device until
> the path is fixed? Would it not cause failures if the issue persists
> for seconds? 

As I said, that's not how it works. R1->P1, R2->P2, ... only holds as
long as all paths are up (and no IO scheduler is active which might re-
order your I/O requests).

> What about the size of queue? Is there any danger of queue getting
> overloaded? Any pointers or references would be of great help.

Theoretically, the queue is only limited by memory size.

Martin

> 
> Thanks!
> Karan
> 
> 
> --
> dm-devel mailing list
> dm-devel at redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel

-- 
Dr. Martin Wilck <mwilck at suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)



