[dm-devel] [PATCH 0/4][RFC] dm-core: full barrier support
Mikulas Patocka
mpatocka at redhat.com
Sun Jan 11 19:28:29 UTC 2009
On Thu, 8 Jan 2009, Milan Broz wrote:
> This patchset is just resend for new kernel of a barrier implementation.
>
> Please note that it is more request for discussion...
>
> [PATCH 1/4] dm-core: remove current partial barrier implementation
>
> [PATCH 2/4] dm-core: Add zero-size barrier processing to device-mapper
>
> [PATCH 3/4] dm-core: Process barrier request with payload in device-mapper
>
> [PATCH 4/4] dm-core: Implement waiting for barrier in dm_suspend request
>
>
> Milan
> --
> mbroz at redhat.com
Hi
I looked at the patches.
There are a few fixable technical bugs:
* After you submit a barrier, you wait for completion of the barrier and
preceding requests. Yet, in your implementation the barrier request may
be reordered with preceding requests, and this violates the barrier
specification. There need to be two waits: one before submitting the
barrier and one after.
* If you endio a barrier request, you end the request and then call
DM_WQ_BARRIER_POST, which flushes caches. You must flush caches first and
only then call bio_endio (so that when the barrier request finishes, the
data are certainly on the disk).
* What about DM_ENDIO_REQUEUE and the pushback queue? Alasdair, please
describe what it is used for (it is used in mpath and raid1). With two
queues it is impossible to get barrier ordering right; this somehow needs
to be changed to a one-queue design --- the deferred and pushback bio
lists joined. It would be best to remove DM_ENDIO_REQUEUE, if that is
possible.
* If you clear DMF_BLOCK_IO, you don't run the deferred queue --- this
would produce a lockup if barrier and non-barrier requests were submitted
simultaneously (the non-barrier request gets queued on the deferred queue
because of DMF_BLOCK_IO, but is not dequeued when the barrier finishes).
Individual filesystems usually don't do this --- but they are allowed to
--- for example, to run direct I/O in parallel with a journal commit.
Another situation where this lockup happens is when the filesystem is
mounted read-write and the admin runs fsck in read-only mode.
* Maybe the patch could be simplified somehow; not all the branches are
needed (for example, join DM_WQ_BARRIER and DM_WQ_BARRIER_POST).
--- and the main issue with the patch:
The patch waits in the dm_request routine until the barrier finishes. This
is allowed (no deadlock or crash happens because of it), but it basically
turns asynchronous I/O into synchronous I/O and degrades performance.
As it turned out in my discussion on lkml before Christmas, all the
current in-kernel filesystems use barriers in a synchronous way, so there
is no problem with synchronous waiting inside dm_request now. But using
barriers synchronously suppresses any performance advantage barriers
could bring (you could replace synchronous barriers with hardware cache
flushes and get the same performance with simpler code). If someone ever
introduces an async-barrier filesystem into the kernel, synchronous
processing will kill performance.
So this synchronous waiting is not a performance problem now but it may be
problem in the future.
I have thought about it and written an implementation of asynchronous
barriers --- it works by passing the requests to kdmflush and returning
immediately from dm_request. I'll post it in a few days.
Mikulas