[dm-devel] Re: possible regression by the barrier patch in 2.6.30-rc2

Mikulas Patocka mpatocka at redhat.com
Thu Nov 5 01:53:23 UTC 2009



On Wed, 28 Oct 2009, Alasdair G Kergon wrote:

> Well let's go back to first principles.
> 
> There are two types of suspend.
> 
> (1) I don't care about the ordering of the I/O on the disk relative to
> the suspend.  This one is easy: it's --noflush --nolockfs.
> 
> (2) I do want some control over the state of the device at the point
> of the suspend.
> 
> Break this second case down.
> If I have a filesystem, I require it to be consistent, so I require lockfs.
> 
> If the device belongs to a userspace database, I require it to be consistent,
> so I must pause the database, at which point there is no further I/O being
> issued to the device, then I suspend (and everything prior to this must be
> flushed) and resume etc.  This can be either a "suspend with flush" OR the
> flush could have been issued prior to the suspend (and any decent database
> would have done that).

In this database case, you have to pause the database; the pause procedure 
waits until all I/O finishes and doesn't submit new I/O. When the pause 
procedure finishes, the database has no I/O in flight, so it doesn't 
matter whether you use flush or noflush suspend.

The reason the pause has to wait is that there may be other I/O midlayers 
between the database and the device mapper. So, if the database submits 
I/O, it doesn't have to arrive at the device mapper immediately. If you 
paused the database (without waiting for the I/Os to complete) and then 
issued a "flush" suspend, the I/O may still be pending somewhere above the 
suspended device; the device then finishes the flush suspend, and only 
afterwards does the I/O arrive, where it waits until unsuspend.
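
The ordering above can be illustrated with a toy model. This is only a 
sketch under simplifying assumptions, not device-mapper code: ToyDevice, 
ToyMidlayer and run() are made-up names, the "midlayer" is just a queue, 
and "flush" here means nothing more than "complete whatever has already 
reached the device".

```python
from collections import deque


class ToyDevice:
    """Hypothetical stand-in for a suspendable device (not the dm implementation)."""

    def __init__(self):
        self.suspended = False
        self.seen = []        # I/O that reached the device but has not completed
        self.completed = []   # I/O that has completed
        self.held = []        # I/O that arrived while the device was suspended

    def submit(self, io):
        # I/O arriving during suspend is held until resume.
        if self.suspended:
            self.held.append(io)
        else:
            self.seen.append(io)

    def complete_all(self):
        self.completed += self.seen
        self.seen = []

    def suspend(self, flush):
        # Assumption: "flush" can only act on I/O the device already knows about.
        if flush:
            self.complete_all()
        self.suspended = True


class ToyMidlayer:
    """Hypothetical layer between the database and the device; it delays I/O."""

    def __init__(self, device):
        self.device = device
        self.queue = deque()

    def submit(self, io):
        self.queue.append(io)

    def deliver(self):
        while self.queue:
            self.device.submit(self.queue.popleft())


def run(pause_waits_for_io, flush):
    """Return the I/O left waiting on the suspended device."""
    dev = ToyDevice()
    mid = ToyMidlayer(dev)
    mid.submit("write-1")          # database issues I/O; it sits in the midlayer
    if pause_waits_for_io:
        mid.deliver()              # a proper pause waits for the I/O to arrive...
        dev.complete_all()         # ...and to complete
    dev.suspend(flush=flush)
    mid.deliver()                  # anything still above the device arrives now
    return dev.held


if __name__ == "__main__":
    for waits in (False, True):
        for flush in (False, True):
            print(waits, flush, run(waits, flush))
```

In this model the flush flag never changes the outcome: if the pause does 
not wait, the write is stuck on the suspended device either way, and if 
the pause does wait, there is nothing left for the flush to act on.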

I'm somehow starting to think that "flush" suspend is not needed at all 
and all suspends may be "noflush". Do you have any counterexamples?

Mikulas

> If I have any other application owning the device that cares about the
> state on the device at the point of the suspend it *must* stop issuing
> I/O while the suspend ioctl is run.  If any I/O continues to arrive
> during the suspend, then that tells us we are necessarily in case (1).
> 
> So the implementation of 'suspend with flush' is simply 'issue a flush
> using the standard mechanism for doing this and then issue a suspend'.
> The 'suspend' itself does not need any special handling for 'flush'.
> In practice, anything that requires a flush will issue it prior to
> calling the suspend ioctl and not send I/O concurrently with the
> suspend.  For backwards compatibility we could still support the
> 'suspend with flush' as I described - issue a flush internally before
> entering the code that performs the suspend.
> 
> Alasdair
> 



