[dm-devel] [PATCH] multipathd: "san_path_err" failure optimization

Chongyun Wu wu.chongyun at h3c.com
Tue Aug 27 12:28:14 UTC 2019


Hi Martin and Ben,

Could you please review the patch below? Thanks.

From a7126e33e7eff8a985600b41b1723ee66b183586 Mon Sep 17 00:00:00 2001
From: Chongyun Wu <wu.chongyun at h3c.com>
Date: Tue, 27 Aug 2019 10:23:50 +0800
Subject: [PATCH] multipathd: "san_path_err" failure optimization

Detect paths that are unstable during san_path_err_recovery_time,
and do not reinstate such a path until it has stayed up for the
whole san_path_err_recovery_time. This fixes the heavy IO delays
caused by some paths of a multipath device flapping.

Test and result:
A script brought eth1 up for 30s and then down for 30s, 100 times,
to make some paths of each multipath device shaky.
The defaults section of multipath.conf contained:
    san_path_err_recovery_time 30
    san_path_err_threshold 2
    san_path_err_forget_rate 6
After the test, no IO delay messages were found in the logs, except
for a few at the very beginning, before the san_path_err filtering of
shaky paths took effect. Without this configuration and this patch,
syslog shows many IO delays, and some paths repeatedly change state
from up to down.

Signed-off-by: Chongyun Wu <wu.chongyun at h3c.com>
---
 multipathd/main.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
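
For reviewers, the intended behavior can be modelled with the minimal
sketch below. It is not multipathd code: the sketch_* names, the
simplified state enum and the disable_reinstate handling are assumptions
made for illustration only, and the san_path_err_threshold and forget-rate
accounting is left out.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical, simplified stand-ins for multipathd's path states. */
enum sketch_state { SK_DOWN, SK_UP, SK_GHOST, SK_DELAYED };

struct sketch_path {
	enum sketch_state state;   /* state recorded by the previous check */
	time_t dis_reinstate_time; /* start of the current stability window */
	int forget_rate;           /* per-path copy of the forget rate */
	bool disable_reinstate;    /* assumed set once the error threshold tripped */
};

/*
 * Returns true while reinstatement must still be delayed, false once the
 * path has stayed healthy for more than recovery_time seconds.
 */
static bool sketch_keep_delayed(struct sketch_path *pp, time_t now,
				int cfg_forget_rate, int recovery_time)
{
	if (!pp->disable_reinstate)
		return false;

	/* Previous state was a failure: restart the stability window. */
	if (pp->state != SK_UP && pp->state != SK_GHOST &&
	    pp->state != SK_DELAYED) {
		pp->forget_rate = cfg_forget_rate;
		pp->dis_reinstate_time = now;
	}

	if ((now - pp->dis_reinstate_time) > recovery_time) {
		pp->disable_reinstate = false; /* stable long enough */
		return false;
	}
	return true;
}

/* One checker tick; newstate is what the path checker just reported. */
static void sketch_check(struct sketch_path *pp, enum sketch_state newstate,
			 time_t now, int cfg_forget_rate, int recovery_time)
{
	bool healthy = (newstate == SK_UP || newstate == SK_GHOST);

	if (healthy && sketch_keep_delayed(pp, now, cfg_forget_rate,
					   recovery_time))
		pp->state = SK_DELAYED;
	else
		/* Also covers "failed again while delayed": drop the delay. */
		pp->state = newstate;
}
```

In this model, with a recovery time of 30, a path that comes up at
t=100, fails at t=120 and comes up again at t=125 stays delayed until a
check that runs more than 30s after t=125.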

diff --git a/multipathd/main.c b/multipathd/main.c
index 7a5cd11..8acd080 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -1897,6 +1897,18 @@ static int check_path_reinstate_state(struct path * pp) {
 			condlog(2, "%s : reinstating path early", pp->dev);
 			goto reinstate_path;
 		}
+
+		/* If the path failed again or is still failed, reset the
+		 * path's san_path_err_forget_rate and dis_reinstate_time to
+		 * start a new stability check.
+		 */
+		if ((pp->state != PATH_UP) && (pp->state != PATH_GHOST) &&
+			(pp->state != PATH_DELAYED)) {
+			pp->san_path_err_forget_rate =
+				pp->mpp->san_path_err_forget_rate;
+			pp->dis_reinstate_time = curr_time.tv_sec;
+		}
+
 		if ((curr_time.tv_sec - pp->dis_reinstate_time ) > pp->mpp->san_path_err_recovery_time) {
 			condlog(2,"%s : reinstate the path after err recovery time", pp->dev);
 			goto reinstate_path;
@@ -2106,6 +2118,11 @@ check_path (struct vectors * vecs, struct path * pp, int ticks)
 			check_path_reinstate_state(pp)) {
 		pp->state = PATH_DELAYED;
 		return 1;
+	} else if ((newstate != PATH_UP && newstate != PATH_GHOST) &&
+			(pp->state == PATH_DELAYED)) {
+		/* If the path fails again, cancel the delayed state */
+		pp->state = newstate;
+		return 1;
 	}
 
 	if ((newstate == PATH_UP || newstate == PATH_GHOST) &&
-- 

Best Regards,
Chongyun Wu




