[dm-devel] [PATCH V4 0/2] multipath-tools: intermittent IO error accounting to improve reliability
Guan Junxiong
guanjunxiong at huawei.com
Sun Sep 17 03:40:36 UTC 2017
Hi ALL,
This patchset adds a new method of path state checking based on accounting
IO errors. This is useful in many scenarios, such as intermittent IO errors
on a path due to network congestion or a shaky link.
PATCH 1/2 implements the algorithm, which sends continuous IOs to a path
at a fixed rate of 10 Hz.
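As a rough illustration of the fixed 10 Hz rate, the sketch below derives the
interval between two IOs and advances an absolute deadline by that interval.
The helper names io_interval_ns and advance_deadline are hypothetical and are
not taken from the patch; this is only one way such pacing could be done.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical helper: nanoseconds between two IOs for a given send rate. */
static long io_interval_ns(long rate_hz)
{
	assert(rate_hz > 0);
	return NSEC_PER_SEC / rate_hz;
}

/*
 * Hypothetical helper: move an absolute deadline forward by interval_ns,
 * keeping tv_nsec normalized to the range [0, 1 second).
 */
static void advance_deadline(struct timespec *deadline, long interval_ns)
{
	deadline->tv_nsec += interval_ns;
	while (deadline->tv_nsec >= NSEC_PER_SEC) {
		deadline->tv_nsec -= NSEC_PER_SEC;
		deadline->tv_sec++;
	}
}
```

At 10 Hz the interval is 100 ms. A sender loop could sleep until each deadline
with clock_nanosleep() and TIMER_ABSTIME, so that the time spent submitting an
IO does not accumulate as drift.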
PATCH 2/2 discards the original algorithm (the san_path_err_XXX feature)
for the following reason: the sample interval of that path checker is so
coarse that it cannot see what happens in the middle of the interval.
PATCH 1/2 provides a better method.
Changes from V3:
* discard the
* fail the path in the kernel before enqueueing the path for checking
rather than after knowing the checking result to make it more
reliable. (Martin)
* use posix_memalign instead of manual alignment for direct IO buffer. (Martin)
* use PATH_MAX to avoid certain compiler warning when opening file
rather than FILE_NAME_SIZE. (Martin)
* discard unnecessary sanity check when getting block size (Martin)
* do not return 0 in send_each_aync_io if io_starttime of a path is
not set (Martin)
* wait 10 ms instead of 60 seconds if every path is down (Martin)
* rename handle_async_io_timeout to poll_async_io_timeout and use a polling
method, because io_getevents does not return 0 when there are both
timed-out IOs and normal IOs
* rename hit_io_err_recover_time to hit_io_err_recheck_time
* update multipath.conf.5 and the commit comments to keep them in sync with
the above changes
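For readers unfamiliar with the posix_memalign change listed above, here is a
minimal sketch of allocating an aligned buffer for direct IO. The helper names
alloc_dio_buf and is_aligned are hypothetical, and the patch itself may
structure this differently.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: check that p is aligned to a power-of-two boundary. */
static int is_aligned(const void *p, size_t align)
{
	return ((uintptr_t)p & (align - 1)) == 0;
}

/*
 * Hypothetical helper: allocate a zeroed buffer of `size` bytes aligned to
 * `blksize`, as O_DIRECT reads typically require. Returns NULL on failure.
 */
static void *alloc_dio_buf(size_t blksize, size_t size)
{
	void *buf = NULL;

	/* posix_memalign needs a power of two that is a multiple of sizeof(void *) */
	assert(blksize >= sizeof(void *) && (blksize & (blksize - 1)) == 0);
	if (posix_memalign(&buf, blksize, size) != 0)
		return NULL;
	memset(buf, 0, size);
	return buf;
}
```

Compared with over-allocating and rounding the pointer up by hand, the
returned pointer can be passed directly to free().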
Changes from V2:
* fix unconditional rescheduling forever
* use scripts/checkpatch.pl from Linux to clean up informal coding style
* fix "continous" and "internel" typos
Changes from V1:
* send continuous IO instead of a single IO in a sample interval (Martin)
* when the recover time expires, reschedule the checking process (Hannes)
* use the error rate threshold as a permillage instead of an IO number (Martin)
* Use a common io_context for libaio for all paths (Martin)
* Other small fixes (Martin)
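To illustrate the permillage threshold mentioned in the V1 changes, the sketch
below computes an error rate in errors per 1000 IOs and compares it with a
threshold. The function names and the strict "greater than" comparison are
assumptions made for illustration, not the code from the patch.

```c
#include <assert.h>

/* Hypothetical helper: error rate in permille (errors per 1000 IOs). */
static unsigned int io_err_permille(unsigned int err_nr, unsigned int total_nr)
{
	if (total_nr == 0)
		return 0;	/* no IO accounted yet, so nothing to judge */
	/* widen before multiplying so err_nr * 1000 cannot overflow */
	return (unsigned int)((unsigned long long)err_nr * 1000 / total_nr);
}

/*
 * Hypothetical helper: nonzero if the accounted error rate exceeds the
 * configured threshold (assumed here to be expressed in permille).
 */
static int path_is_shaky(unsigned int err_nr, unsigned int total_nr,
			 unsigned int threshold_permille)
{
	assert(err_nr <= total_nr);
	return io_err_permille(err_nr, total_nr) > threshold_permille;
}
```

A rate-based threshold keeps its meaning when the sample duration changes,
whereas an absolute IO error count would have to be retuned each time.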
Junxiong Guan (2):
multipath-tools: intermittent IO error accounting to improve
reliability
multipath-tools: discard san_path_err_XXX feature
libmultipath/Makefile | 5 +-
libmultipath/config.c | 3 -
libmultipath/config.h | 18 +-
libmultipath/configure.c | 6 +-
libmultipath/dict.c | 74 ++---
libmultipath/io_err_stat.c | 743 +++++++++++++++++++++++++++++++++++++++++++++
libmultipath/io_err_stat.h | 15 +
libmultipath/propsel.c | 54 ++--
libmultipath/propsel.h | 6 +-
libmultipath/structs.h | 14 +-
libmultipath/uevent.c | 32 ++
libmultipath/uevent.h | 2 +
multipath/multipath.conf.5 | 62 ++--
multipathd/main.c | 130 ++++----
14 files changed, 971 insertions(+), 193 deletions(-)
create mode 100644 libmultipath/io_err_stat.c
create mode 100644 libmultipath/io_err_stat.h
--
2.11.1