[dm-devel] DM-Multipath path failure questions..
Mike Christie
michaelc at cs.wisc.edu
Wed Nov 14 17:28:37 UTC 2007
Michael Vallaly wrote:
> Hello,
>
> I am currently using the dm-multipather (multipath-tools) to allow high-availability / increased capacity to our Equallogic iSCSI SAN. I was wondering if anyone had come across a way to re-instantiate a failed path / paths from a multipath target, when the backend device (iscsi initiator) goes away.
>
> All goes well until we have a lengthy network hiccup or non-recoverable iSCSI error, in which case the multipather seems to get wedged. The path seems to get stuck in an [active][faulty] state and the backend block device (sdX) actually gets removed from the system. I have tried reconnecting the iSCSI session after this happens, and I get a new, different backend block device (e.g. sdg instead of sdf), but the multipather never picks it up and never resumes I/O, and I generally then have to power cycle the box.
>
> We have anywhere from 2 to 4 iSCSI sessions open per multipath target, but even one path failing seems to cause the whole multipath to die. I am hoping there is a way to continue on after a path failure, rather than the power cycle. I have tried multipath-tools 0.4.6/0.4.7/0.4.8, and almost every permutation of the configuration I can think of. Maybe I am missing something quite obvious.
>
What are you doing on the target that causes the device (sdX) to be
removed, and what error do you get? Normally that only happens if you
run the iscsiadm logout command, if the target sends the initiator an
error indicating that it is going away for good, or if there is some
other error, such as the CHAP values having changed on the target.
Older versions of open-iscsi also have a bug where the session is killed
and the sdX devices are removed a little early on errors that should be
recoverable. (We found the bug in 865-*; it is fixed in the open-iscsi
git tree and the fix will be in the new release.) So I just want to make
sure I have caught all the recoverable errors.

What kernel are you using? And after you reconnect the session and get a
new sdX, what happens if you run the multipath command by hand?
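
Something like the following rough sketch would collect that information
in one go. It assumes open-iscsi (iscsiadm) and multipath-tools
(multipath) are installed and that it is run as root after the session
has been reconnected; run_if_present is just a hypothetical helper for
this example, not part of either package.

```shell
#!/bin/sh
# Diagnostic sketch only: gathers kernel version, iSCSI session list and
# multipath state. Assumes iscsiadm and multipath are installed and that
# the script runs as root; missing tools are reported and skipped.

run_if_present() {
    # Hypothetical helper: run a command only if its binary exists,
    # and never abort the script if the command itself fails.
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $*"
        "$@" || echo "(command failed with status $?)"
    else
        echo "== $1 not found, skipping"
    fi
}

run_if_present uname -r              # running kernel version
run_if_present iscsiadm -m session   # list the active iSCSI sessions
run_if_present multipath -v2         # run multipath by hand, verbosity 2
run_if_present multipath -ll         # show the multipath topology and path states
```

The output of the last two commands, taken after the new sdX has
appeared, should show whether multipath sees the new device at all.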