[dm-devel] [PATCH v7 0/2] multipath-tools: intermittent IO error accounting to improve reliability
Guan Junxiong
guanjunxiong at huawei.com
Mon Nov 6 13:04:32 UTC 2017
On 2017/11/6 20:15, Muneendra Kumar M wrote:
> Hi Guan,
> Any update on this patch?
>
> Regards,
> Muneendra.
>
It's not merged yet; it is waiting for Christophe to merge it.
I hope Christophe can give some feedback soon.
BTW, your clients (and mine) can keep using this
patch until it is merged into mainline.
Please be patient. I think Christophe will eventually pick up this patch.
Best wishes.
Guan
> -----Original Message-----
> From: Guan Junxiong [mailto:guanjunxiong at huawei.com]
> Sent: Thursday, November 02, 2017 6:20 AM
> To: christophe.varoqui at opensvc.com
> Cc: dm-devel at redhat.com; Muneendra Kumar M <mmandala at Brocade.com>; mwilck at suse.com; shenhong09 at huawei.com; niuhaoxin at huawei.com
> Subject: Re: [PATCH v7 0/2] multipath-tools: intermittent IO error accounting to improve reliability
>
> Dear Christophe,
>
> Could you please consider applying this patch or give us feedback on it?
> We (Huawei and Brocade) are looking forward to your reply.
> Thanks.
>
> Regards
> Guan Junxiong
>
> .
>
>
> On 2017/10/24 9:57, Guan Junxiong wrote:
>> Hi Christophe and All,
>>
>> This patch set adds a new method of path state checking based on IO
>> error accounting. This is useful in many scenarios, such as
>> intermittent IO errors on a path due to intermittent frame drops,
>> intermittent corruptions, network congestion or a shaky link.
>>
>> The motivation for this patch set is as follows (quoted from the
>> discussion with Muneendra, Brocade):
>>
>> There are typically two types of SAN network problems that are
>> categorized as marginal issues. These issues are by nature not
>> permanent; they come and go over time.
>> 1) Switches in the SAN can have intermittent frame drops or intermittent
>> frame corruptions due to a bad optics cable (SFP) or similar wear-and-tear
>> port issues. This causes ITL flows that go through the faulty switch/port
>> to intermittently experience frame drops.
>> 2) There exist SAN topologies in which some switch ports in the fabric
>> become the only conduit for many different ITL (host--target--LUN)
>> flows across multiple hosts. These single network paths are essentially
>> shared across multiple ITL flows. Under these conditions, if the port link
>> bandwidth cannot handle the sum of the bandwidth of the shared ITL flows
>> going through the single path, we can see intermittent network
>> congestion problems. This condition is called network oversubscription.
>> The intermittent congestion can delay SCSI exchange completion
>> (an increase in I/O latency is observed).
>>
>> To overcome the above network issues and many more such target issues,
>> there are frame-level retries done in HBA device firmware and
>> I/O retries in the SCSI layer. These retries might succeed for one of
>> three reasons:
>> 1) The intermittent switch/port issue is not observed.
>> 2) The retry I/O is a new SCSI exchange. This SCSI exchange can take an
>> alternate SAN path for the ITL flow, if such a SAN path exists.
>> 3) Network congestion disappears momentarily because the net I/O bandwidth
>> coming from multiple ITL flows on the single shared network path is
>> something the path can handle.
>>
>> However, in some cases we have seen that I/O retries don't succeed because
>> the retry I/Os hit a SAN network path that has an intermittent
>> switch/port issue and/or network congestion.
>>
>> On the host we thus see configurations with two or more ITL paths sharing
>> the same target/LUN and going through two or more HBA ports. These HBA
>> ports are connected through two or more SANs to the same target/LUN.
>> If the I/O fails at the multipath layer, the ITL path is put
>> into the Failed state. Because of the marginal nature of the network, the
>> next health check command sent from the multipath layer might succeed,
>> which puts the ITL path back into the Active state. You end up
>> seeing the DM path state going through Active, Failed, Active
>> transitions. This results in an overall reduction in application I/O
>> throughput and sometimes in application I/O failures (because of timing
>> constraints). All this can happen because of I/O retries and I/O
>> requests moving across multiple paths of the DM device. Note that on the
>> host, all I/O retries on a single path and all I/O movement
>> across multiple paths slow down the forward progress of
>> new application I/O. The reason is that these I/O re-queue actions are
>> given higher priority than the newer I/O requests coming from the
>> application.
>>
>> The above condition of the ITL path is hence called "marginal".
>>
>> What we desire is for the DM to deterministically categorize an ITL
>> path as "marginal" and move all the pending I/Os from the marginal
>> path to an Active path. This will help in meeting application I/O
>> timing constraints. We also want the capability to automatically reinstate
>> the marginal path as Active once the marginal condition in the network
>> is fixed.
>>
>>
>> Here is a description of the implementation:
>> 1) PATCH 1/2 implements the algorithm, which sends a series of
>> continuous IOs to a path that suffers two failure events in less than
>> a given time. Those IOs are sent at a fixed rate of 10 Hz.
>> 2) PATCH 2/2 discards the original algorithm (the san_path_err_XXX
>> feature) for this reason: the sampling interval of that path checker
>> is so coarse that it doesn't see what happens in the middle of the
>> interval. PATCH 1/2 is a better method.
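
For readers who prefer code, here is a minimal C sketch of the pre-check in item 1), under stated assumptions. The names path_fail_state, is_double_failure and probes_in_window are hypothetical and are not taken from the patch; only the "two failures in less than a given time" trigger and the fixed 10 Hz rate come from the description above.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical per-path bookkeeping; not the structures used in the patch. */
struct path_fail_state {
	time_t last_fail;	/* time of the previous failure, 0 if none yet */
};

/*
 * Record a failure at time `now` and report whether it is the second
 * failure within `window` seconds. Assumption: 0 means "no earlier
 * failure", so a real failure at time 0 is not representable here.
 */
static bool is_double_failure(struct path_fail_state *st, time_t now,
			      time_t window)
{
	bool hit = st->last_fail != 0 && (now - st->last_fail) < window;

	st->last_fail = now;
	return hit;
}

/* Number of probe IOs issued over sample_secs seconds at rate_hz per second. */
static long probes_in_window(long sample_secs, long rate_hz)
{
	return sample_secs * rate_hz;
}
```

With a 60 second window, a first failure at t=100 does not trigger the check, a second one at t=130 does, and at 10 Hz a 120 second sample amounts to 1200 probe IOs.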
>>
>>
>> Changes from V6:
>> * fix the warning of unwrapped commit description in patch 1/2
>> * add Reviewed-by tag of Muneendra
>> * add detailed scenario description in the cover letter
>>
>> Changes from V5:
>> * rebase on the latest release 0.7.3
>>
>>
>> Changes from V4:
>> * path_io_err_XXX -> marginal_path_err_XXX. (Muneendra)
>> * add one more parameter named marginal_path_double_failed_time instead
>> of the fixed 60 seconds for the pre-checking of a shaky path.
>> (Martin)
>> * fix for "reschedule checking after %d seconds" log
>> * path_io_err_recovery_time -> marginal_path_err_recheck_gap_time.
>> * put the marginal path into PATH_SHAKY instead of PATH_DELAYED
>> * Modify the commit comments to sync with the changes above.
>>
>>
>> Changes from V3:
>> * add a patch to discard the san_path_err_XXX feature
>> * fail the path in the kernel before enqueueing the path for checking
>> rather than after knowing the checking result to make it more
>> reliable. (Martin)
>> * use posix_memalign instead of manual alignment for direct IO buffer.
>> (Martin)
>> * use PATH_MAX to avoid certain compiler warning when opening file
>> rather than FILE_NAME_SIZE. (Martin)
>> * discard unnecessary sanity check when getting block size (Martin)
>> * do not return 0 in send_each_aync_io if io_starttime of a path is
>> not set (Martin)
>> * Wait 10ms instead of 60 seconds if every path is down. (Martin)
>> * rename handle_async_io_timeout to poll_async_io_timeout and use a polling
>> method because io_getevents does not return 0 if there are both timed-out
>> IOs and normal IOs.
>> * rename hit_io_err_recover_time to hit_io_err_recheck_time
>> * modify the multipath.conf.5 and commit comments to keep them in sync with the
>> above changes
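
As an illustration of the posix_memalign item in the list above, here is a minimal C sketch of allocating an aligned buffer for direct IO. The helper names alloc_dio_buffer and is_aligned are hypothetical, and the sketch assumes the block size is a power of two that is a multiple of sizeof(void *), for example 512 or 4096. It is not the code from the patch.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* expose posix_memalign under strict -std modes */
#endif
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: allocate one zeroed buffer of blk_size bytes,
 * aligned to blk_size, as O_DIRECT reads typically require.
 * Returns NULL on failure; release the buffer with free().
 */
static void *alloc_dio_buffer(size_t blk_size)
{
	void *buf = NULL;

	/* posix_memalign returns 0 on success and an error number otherwise */
	if (posix_memalign(&buf, blk_size, blk_size) != 0)
		return NULL;
	memset(buf, 0, blk_size);
	return buf;
}

/* Returns 1 if ptr is aligned to align bytes (align must be non-zero). */
static int is_aligned(const void *ptr, size_t align)
{
	return ((uintptr_t)ptr % align) == 0;
}
```

Compared with over-allocating and rounding the pointer up by hand, this keeps the pointer returned by the allocator usable directly with free().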
>>
>>
>> Changes from V2:
>> * fix unconditional rescheduling forever
>> * use scripts/checkpatch.pl in Linux to clean up informal coding style
>> * fix "continous" and "internel" typos
>>
>>
>> Changes from V1:
>> * send continuous IO instead of a single IO in a sample interval
>> (Martin)
>> * when the recovery time expires, we reschedule the checking process
>> (Hannes)
>> * Use the error rate threshold as a permillage instead of an IO
>> count (Martin)
>> * Use a common io_context for libaio for all paths (Martin)
>> * Other small fixes (Martin)
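
To make the permillage threshold from the list above concrete, here is a minimal C sketch. The function names are hypothetical, and whether the patch compares with ">" or ">=" is an assumption here, not something stated in this thread.

```c
#include <stdbool.h>

/*
 * Error rate in permille (errors per 1000 IOs), rounded down.
 * Returns 0 when no IO was sent, to avoid dividing by zero.
 */
static unsigned long error_rate_permille(unsigned long errors,
					 unsigned long total)
{
	if (total == 0)
		return 0;
	return errors * 1000 / total;
}

/*
 * Assumption: a path counts as marginal when its error rate is
 * strictly above the configured threshold.
 */
static bool path_is_marginal(unsigned long errors, unsigned long total,
			     unsigned long threshold_permille)
{
	return error_rate_permille(errors, total) > threshold_permille;
}
```

Expressing the threshold as a rate rather than an absolute IO count keeps its meaning stable if the sample time, and therefore the number of probe IOs, changes.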
>>
>>
>> Junxiong Guan (2):
>> multipath-tools: intermittent IO error accounting to improve
>> reliability
>> multipath-tools: discard san_path_err_XXX feature
>>
>>  libmultipath/Makefile      |   5 +-
>>  libmultipath/config.c      |   3 -
>>  libmultipath/config.h      |  21 +-
>>  libmultipath/configure.c   |   7 +-
>>  libmultipath/dict.c        |  88 +++---
>>  libmultipath/io_err_stat.c | 744 +++++++++++++++++++++++++++++++++++++++++++++
>>  libmultipath/io_err_stat.h |  15 +
>>  libmultipath/propsel.c     |  70 +++--
>>  libmultipath/propsel.h     |   7 +-
>>  libmultipath/structs.h     |  15 +-
>>  libmultipath/uevent.c      |  32 ++
>>  libmultipath/uevent.h      |   2 +
>>  multipath/multipath.conf.5 |  89 ++++--
>>  multipathd/main.c          | 140 ++++-----
>>  14 files changed, 1043 insertions(+), 195 deletions(-)
>>  create mode 100644 libmultipath/io_err_stat.c
>>  create mode 100644 libmultipath/io_err_stat.h
>>
>