[dm-devel] nvme: allow retry for requests with REQ_FAILFAST_TRANSPORT set
Mike Snitzer
snitzer at redhat.com
Thu Apr 15 23:11:26 UTC 2021
BZ: 1948690
Upstream Status: RHEL-only
Signed-off-by: Mike Snitzer <snitzer at redhat.com>
rhel-8.git commit 7dadadb072515f243868e6fe2f7e9c97fd3516c9
Author: Mike Snitzer <snitzer at redhat.com>
Date: Tue Aug 25 21:52:48 2020 -0400
[nvme] nvme: allow retry for requests with REQ_FAILFAST_TRANSPORT set
Message-id: <20200825215248.2291-11-snitzer at redhat.com>
Patchwork-id: 325180
Patchwork-instance: patchwork
O-Subject: [RHEL8.3 PATCH 10/10] nvme: allow retry for requests with REQ_FAILFAST_TRANSPORT set
Bugzilla: 1843515
RH-Acked-by: David Milburn <dmilburn at redhat.com>
RH-Acked-by: Gopal Tiwari <gtiwari at redhat.com>
RH-Acked-by: Ewan Milne <emilne at redhat.com>
BZ: 1843515
Upstream Status: RHEL-only
Based on a patch that was proposed upstream but ultimately rejected, see:
https://www.spinics.net/lists/linux-block/msg57490.html
I'd have made this change even if this wasn't already posted obviously,
but I figured I'd give proper attribution due to their public post with
the same code change.
Author: Chao Leng <lengchao at huawei.com>
Date: Wed Aug 12 16:18:55 2020 +0800
nvme: allow retry for requests with REQ_FAILFAST_TRANSPORT set
REQ_FAILFAST_TRANSPORT may have been designed for SCSI, because the SCSI
protocol does not define a local retry mechanism. SCSI implements a fuzzy
local retry mechanism, so REQ_FAILFAST_TRANSPORT is needed to allow
higher-level multipathing software to perform failover/retry.
NVMe differs from SCSI in this respect. It defines a local retry
mechanism and path error codes, so NVMe should retry locally on non-path
errors. For path-related errors, whether and how to retry is still
determined by the higher-level multipathing's failover.
Unlike SCSI, NVMe shouldn't prevent retry if REQ_FAILFAST_TRANSPORT is
set, because NVMe's local retry is needed -- as is NVMe-specific logic to
categorize whether an error is path related.
In this way, the mechanisms of NVMe multipath and other multipath
implementations are now equivalent: non-path-related errors are
retried locally, and path-related errors are handled by multipath.
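The split described above can be sketched as a small standalone C model.
This is an illustration only, not the driver code: the outcome names, the
cmd_result struct, and classify() are hypothetical, and the real driver
derives these facts from the NVMe status word rather than from booleans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Outcome names are hypothetical; they are not the kernel's enum values. */
enum outcome {
	OUTCOME_COMPLETE,           /* finish the request and report upward */
	OUTCOME_RETRY_LOCALLY,      /* NVMe resubmits the command itself */
	OUTCOME_LEAVE_TO_MULTIPATH  /* path error: failover logic decides */
};

/* Simplified view of a finished command, for illustration only. */
struct cmd_result {
	bool failed;        /* nonzero NVMe status */
	bool path_error;    /* status is in the path-related category */
	bool do_not_retry;  /* controller set the Do Not Retry bit */
	int retries;        /* local retries already attempted */
};

/* Non-path errors are retried locally; path errors go to multipath. */
enum outcome classify(const struct cmd_result *r, int max_retries)
{
	assert(r != NULL && max_retries >= 0);

	if (!r->failed)
		return OUTCOME_COMPLETE;
	if (r->do_not_retry || r->retries >= max_retries)
		return OUTCOME_COMPLETE;
	if (r->path_error)
		return OUTCOME_LEAVE_TO_MULTIPATH;
	return OUTCOME_RETRY_LOCALLY;
}
```

In this model a failed command with no path error and retries remaining
yields OUTCOME_RETRY_LOCALLY, regardless of which multipath layer sits
above it.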
Signed-off-by: Chao Leng <lengchao at huawei.com>
[snitzer: edited header for grammar and to make clearer]
Signed-off-by: Mike Snitzer <snitzer at redhat.com>
Signed-off-by: Mike Snitzer <snitzer at redhat.com>
Signed-off-by: Frantisek Hrbata <fhrbata at redhat.com>
---
drivers/nvme/host/core.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
Index: linux-rhel9/drivers/nvme/host/core.c
===================================================================
--- linux-rhel9.orig/drivers/nvme/host/core.c
+++ linux-rhel9/drivers/nvme/host/core.c
@@ -306,7 +306,14 @@ static inline enum nvme_disposition nvme
 	if (likely(nvme_req(req)->status == 0))
 		return COMPLETE;
 
-	if (blk_noretry_request(req) ||
+	/*
+	 * REQ_FAILFAST_TRANSPORT is set by upper layer software that
+	 * handles multipathing. Unlike SCSI, NVMe's error handling was
+	 * specifically designed to handle local retry for non-path errors.
+	 * As such, allow NVMe's local retry mechanism to be used for
+	 * requests marked with REQ_FAILFAST_TRANSPORT.
+	 */
+	if ((req->cmd_flags & (REQ_FAILFAST_DEV | REQ_FAILFAST_DRIVER)) ||
 	    (nvme_req(req)->status & NVME_SC_DNR) ||
 	    nvme_req(req)->retries >= nvme_max_retries)
 		return COMPLETE;
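
The effect of the changed condition can be checked in isolation with a
userspace sketch. It assumes that blk_noretry_request() tests all three
fail-fast flags, which is what the removed line relied on. The FF_*
values and function names below are hypothetical stand-ins, not the
kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values; the kernel's real REQ_FAILFAST_* bits differ. */
#define FF_DEV       (1u << 0)
#define FF_TRANSPORT (1u << 1)
#define FF_DRIVER    (1u << 2)

/* Before the patch: any fail-fast flag suppresses local retry. */
bool retry_suppressed_before(unsigned int cmd_flags)
{
	return (cmd_flags & (FF_DEV | FF_TRANSPORT | FF_DRIVER)) != 0;
}

/* After the patch: only the device and driver flags suppress it. */
bool retry_suppressed_after(unsigned int cmd_flags)
{
	return (cmd_flags & (FF_DEV | FF_DRIVER)) != 0;
}

/* The only input whose result changes is a transport-only request. */
void check_transport_only(void)
{
	assert(retry_suppressed_before(FF_TRANSPORT));
	assert(!retry_suppressed_after(FF_TRANSPORT));
}
```

Requests that also carry the device or driver fail-fast flag behave the
same before and after the change, so the patch only affects requests
marked with the transport flag alone.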