[dm-devel] nvme: allow retry for requests with REQ_FAILFAST_TRANSPORT set

Mike Snitzer snitzer at redhat.com
Thu Apr 15 23:11:26 UTC 2021


BZ: 1948690
Upstream Status: RHEL-only

Signed-off-by: Mike Snitzer <snitzer at redhat.com>

rhel-8.git commit 7dadadb072515f243868e6fe2f7e9c97fd3516c9
Author: Mike Snitzer <snitzer at redhat.com>
Date:   Tue Aug 25 21:52:48 2020 -0400

    [nvme] nvme: allow retry for requests with REQ_FAILFAST_TRANSPORT set
    
    Message-id: <20200825215248.2291-11-snitzer at redhat.com>
    Patchwork-id: 325180
    Patchwork-instance: patchwork
    O-Subject: [RHEL8.3 PATCH 10/10] nvme: allow retry for requests with REQ_FAILFAST_TRANSPORT set
    Bugzilla: 1843515
    RH-Acked-by: David Milburn <dmilburn at redhat.com>
    RH-Acked-by: Gopal Tiwari <gtiwari at redhat.com>
    RH-Acked-by: Ewan Milne <emilne at redhat.com>
    
    BZ: 1843515
    Upstream Status: RHEL-only
    
    Based on patch that was proposed upstream but ultimately rejected, see:
    https://www.spinics.net/lists/linux-block/msg57490.html
    
    I'd have made this change even if this wasn't already posted obviously,
    but I figured I'd give proper attribution due to their public post with
    the same code change.
    
    Author: Chao Leng <lengchao at huawei.com>
    Date:   Wed Aug 12 16:18:55 2020 +0800
    
        nvme: allow retry for requests with REQ_FAILFAST_TRANSPORT set
    
        REQ_FAILFAST_TRANSPORT may be designed for SCSI, because SCSI protocol
        does not define the local retry mechanism. SCSI implements a fuzzy
        local retry mechanism, so REQ_FAILFAST_TRANSPORT is needed to allow
        higher-level multipathing software to perform failover/retry.
    
        NVMe is different with SCSI about this. It defines a local retry
        mechanism and path error codes, so NVMe should retry local for non
        path error.  If path related error, whether to retry and how to retry
        is still determined by higher-level multipathing's failover.
    
        Unlike SCSI, NVMe shouldn't prevent retry if REQ_FAILFAST_TRANSPORT
        because NVMe's local retry is needed -- as is NVMe specific logic to
        categorize whether an error is path related.
    
        In this way, the mechanism of NVMe multipath or other multipath are
        now equivalent.  The mechanism is: non path related error will be
        retry local, path related error is handled by multipath.
    
        Signed-off-by: Chao Leng <lengchao at huawei.com>
        [snitzer: edited header for grammar and to make clearer]
        Signed-off-by: Mike Snitzer <snitzer at redhat.com>
    
    Signed-off-by: Mike Snitzer <snitzer at redhat.com>
    Signed-off-by: Frantisek Hrbata <fhrbata at redhat.com>

---
 drivers/nvme/host/core.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Index: linux-rhel9/drivers/nvme/host/core.c
===================================================================
--- linux-rhel9.orig/drivers/nvme/host/core.c
+++ linux-rhel9/drivers/nvme/host/core.c
@@ -306,7 +306,14 @@ static inline enum nvme_disposition nvme
 	if (likely(nvme_req(req)->status == 0))
 		return COMPLETE;
 
-	if (blk_noretry_request(req) ||
+	/*
+	 * REQ_FAILFAST_TRANSPORT is set by upper layer software that
+	 * handles multipathing. Unlike SCSI, NVMe's error handling was
+	 * specifically designed to handle local retry for non-path errors.
+	 * As such, allow NVMe's local retry mechanism to be used for
+	 * requests marked with REQ_FAILFAST_TRANSPORT.
+	 */
+	if ((req->cmd_flags & (REQ_FAILFAST_DEV | REQ_FAILFAST_DRIVER)) ||
 	    (nvme_req(req)->status & NVME_SC_DNR) ||
 	    nvme_req(req)->retries >= nvme_max_retries)
 		return COMPLETE;




More information about the dm-devel mailing list