[dm-devel] Re: [PATCH] crash in dm-io when signal is pending
Mikulas Patocka
mpatocka at redhat.com
Tue Jan 27 20:50:57 UTC 2009
On Fri, 23 Jan 2009, Mikulas Patocka wrote:
> On Fri, 23 Jan 2009, Alasdair G Kergon wrote:
>
> > On Wed, Jan 21, 2009 at 10:04:39PM -0500, Mikulas Patocka wrote:
> > > If someone sends a signal to a process performing a synchronous dm-io
> > > call, the kernel may crash.
> >
> > > There is no way to cancel in-progress IOs, so the best solution is to ignore
> > > signals at this point.
> >
> > So what is the impact of this patch at a higher level?
>
> It avoids a crash if the admin kills lvm or dmsetup with SIGKILL at a
> certain point.
>
> AFAIK lvm blocks all the blockable signals while it is performing critical
> operations, so there should be no crash from pressing ^C, losing the
> terminal, or the like.
>
> > - What userspace operations are there that you can interrupt now, but that
> > after this patch you won't be able to?
>
> When I grepped for interruptible sleeps, I found one other possibility:
> aborting a suspend with a signal. I didn't find a crash condition that
> could be caused by this, but it could unfortunately confuse targets.
>
> If a suspend is aborted this way, the presuspend method is called, but
> postsuspend, preresume and resume aren't --- this will confuse target
> drivers --- you end up with an active mirror that has stopped recovering
> or an active snapshot that has stopped merging.
>
> I don't know if aborting suspend this way should be allowed or not.
>
> > (Are there any situations where the io will not complete without a reboot,
> > that could actually be safe today?)
>
> If the io will not complete, you can't reboot with the normal reboot
> script. Unmount/remount-ro waits for ios on a filesystem to complete, so
> they will deadlock.
>
> Mikulas
>
> > Alasdair
> > --
> > agk at redhat.com
Regarding the other possibilities you suggested on the phone call:

There are two main architectural design decisions that can't be changed
without a major code rewrite:
- a submitted bio can't be cancelled
- a device must not be closed while bios are still submitted on it

So, if the function sync_io() were modified so that the structure was
allocated with kmalloc rather than on the stack (so that the function
could be interrupted and exit while bios are still pending), we would have
to somehow make sure that the device wouldn't be closed until all the
bios finish.

So there would be no benefit for the user --- the user still wouldn't be
able to interrupt the target constructor --- because in the error path,
devices are closed, and these close calls would have to wait for the
interrupted bios to finish. There would also be major code bloat, because
these interrupted bios would have to be tracked somehow and dm_put_device
would have to wait for them.

So the best solution is to make dm-io ios uninterruptible.
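
To illustrate the reasoning, here is a rough userspace analogy in C. It is
not the dm-io code: pthreads stand in for the block layer, and the names
io_sketch, complete_io and sync_io_sketch are made up for this sketch. The
point it tries to show is that the completion side writes into a structure
living on the caller's stack, so the caller must not return until every
pending io has completed. Returning early on a signal would leave the
completions writing into a dead stack frame.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

#define NR_IOS 4

/* Hypothetical stand-in for the on-stack structure used by sync_io(). */
struct io_sketch {
	pthread_mutex_t lock;
	pthread_cond_t done_cond;
	int pending;	/* number of "bios" still in flight */
	int error;	/* set if an io could not be submitted */
};

/* Plays the role of the bio completion callback: it writes into *io. */
static void *complete_io(void *arg)
{
	struct io_sketch *io = arg;
	struct timespec delay = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };

	nanosleep(&delay, NULL);	/* pretend the device takes a while */

	pthread_mutex_lock(&io->lock);
	if (--io->pending == 0)
		pthread_cond_signal(&io->done_cond);
	pthread_mutex_unlock(&io->lock);
	return NULL;
}

/* Submits NR_IOS fake ios and waits for all of them; returns 0 on success. */
int sync_io_sketch(void)
{
	struct io_sketch io = { .pending = NR_IOS, .error = 0 };	/* on stack */
	pthread_t workers[NR_IOS];
	int started[NR_IOS];
	int i, error;

	pthread_mutex_init(&io.lock, NULL);
	pthread_cond_init(&io.done_cond, NULL);

	for (i = 0; i < NR_IOS; i++) {
		started[i] = pthread_create(&workers[i], NULL, complete_io, &io) == 0;
		if (!started[i]) {
			/* This io was never submitted, so account for it here. */
			pthread_mutex_lock(&io.lock);
			io.pending--;
			io.error = -1;
			pthread_mutex_unlock(&io.lock);
		}
	}

	/*
	 * The "uninterruptible" wait: loop until nothing is pending.
	 * A signal or spurious wakeup only causes another pass through
	 * the loop; it never lets us leave while completions can still
	 * touch the on-stack structure.
	 */
	pthread_mutex_lock(&io.lock);
	while (io.pending > 0)
		pthread_cond_wait(&io.done_cond, &io.lock);
	pthread_mutex_unlock(&io.lock);

	assert(io.pending == 0);

	for (i = 0; i < NR_IOS; i++)
		if (started[i])
			pthread_join(workers[i], NULL);

	error = io.error;
	pthread_cond_destroy(&io.done_cond);
	pthread_mutex_destroy(&io.lock);
	return error;
}
```

Calling sync_io_sketch() from a main() and building with -pthread should be
enough to try it. Moving the structure to the heap would let the function
return early, but, as argued above, something would then still have to wait
for the pending ios before the device could be closed.
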
Mikulas