[dm-devel] [PATCH 0/7] Fix muitpath/multipathd flush issue

Martin Wilck Martin.Wilck at suse.com
Thu Jun 18 20:06:53 UTC 2020


On Thu, 2020-06-18 at 13:04 -0500, Benjamin Marzinski wrote:
> On Thu, Jun 18, 2020 at 09:00:49AM +0000, Martin Wilck wrote:
> > 
> > With queue_if_no_paths, there could be outstanding IO even if the
> > opencount was 0.
> 
> This is what I don't accept at face value. I wrote a little test
> program that fired off an async read, closed the fd without waiting
> for
> a response, and then tried to exit.  running this on a queueing
> multipath device causes the exit to hang like this
> 
>  cat /proc/22655/task/22655/stack
> [<0>] exit_aio+0xdc/0xf0
> [<0>] mmput+0x33/0x130
> [<0>] do_exit+0x27f/0xb30
> [<0>] do_group_exit+0x3a/0xa0
> [<0>] __x64_sys_exit_group+0x14/0x20
> [<0>] do_syscall_64+0x5b/0x160
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [<0>] 0xffffffffffffffff
> 
> and the multipath device to remain in use
> 
> # dmsetup info mpathb
> Name:              mpathb
> State:             ACTIVE
> Read Ahead:        256
> Tables present:    LIVE
> Open count:        0
> Event number:      7
> Major, minor:      253, 5
> Number of targets: 1
> UUID: mpath-3600d0230000000000e13954ed5f89301
> 

The open count is 0. Wouldn't this be exactly the situation that
Hannes' patch was targeting - opencount 0, but still unable to
flush/close because of outstanding IO? If multipath was trying to flush
the map in this situation, it would disable queueing. Your process
would get an IO error sooner or later, and exit. But depending on the
amount of outstanding IO and the weather, this might not happen before
multipath had attempted to flush the map, and the flush attempt would
fail. By inserting the synchronous flush operation after disabling
queueing, multipath avoids that failure. Am I misunderstanding
something?


> I've asked around, and device-mapper is certainly supposed to flush
> all
> IO before the last close. Of course, there may be a corner case where
> it
> doesn't. If you know of one, that would be worth sharing.
> 
> What I think that flush previously accomplished is that, just like
> the
> flush_on_last_del code, if you stop queueing and error out the IO,
> then
> whatever is waiting on it will get those errors, and likely close the
> device soon after, hopefully in time for multipath to remove the
> device.
> 
> However, since multipath now checks if the device is in use before
> disabling queue_if_no_path, it don't think this code is actually
> helpful
> right now.
> 
> > This IO must be flushed somehow before the map can be
> > removed. Apparently just setting queue_if_no_path = 0 was not
> > enough,
> > that's why Hannes added the suspend/resume. Maybe today we have
> > some
> > better way to force the flush? libdm has the _error_device()
> > function 
> > (aka dmsetup wipe_table) that replaces the map's table by the error
> > target. But this does a map reload, which implies a suspend, too.
> > Perhaps simply an fsync() on the dm device, or just wait a while
> > until
> > all outstanding IO has errored out? But these alternatives don't
> > actually look better to me than Hannes' suspend/resume, they will
> > take
> > at least as much time. Do you have a better idea?
> 
> To go back to the old behavior, we would need to move the check if
> the
> device is in use until after we suspended the device. Or we can keep
> the
> current behavior, and simply remove the flushing and suspending.
> 

AFAICS the "in use" check looks at the opencount, and according to your
output above, it can be 0 while IO is outstanding. I haven't checked
this myself yet. But I can confirm that there was an actual bug (map
removal failing) that was allegedly fixed by the suspend/resume. It was
long ago, I can't tell with confidence if it could still happen.

Don't get me wrong, I'm not generally opposed to the removal of the
flush/suspend. I just wanted to clarify why it was there. It has been
used in multipath only, anyway. If we delegate to multipathd, it makes
sense to handle the situation in multipathd's manner. If we want to
make sure the user experience for "multipath -f" users is unchanged, we
could handle failure accordingly (suspend, resume, and retry flushing
the map).

Best,
Martin

-- 
Dr. Martin Wilck <mwilck at suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer






More information about the dm-devel mailing list