[dm-devel] [PATCH/rfc] dm: revise 'rescue' strategy for md->bs allocations
NeilBrown
neilb at suse.com
Mon Sep 4 00:12:20 UTC 2017
On Fri, Sep 01 2017, Mikulas Patocka wrote:
> On Fri, 1 Sep 2017, NeilBrown wrote:
>
>> On Thu, Aug 31 2017, Mikulas Patocka wrote:
>>
>> >>
>> >> Note that only current->bio_list[0] is offloaded. current->bio_list[1]
>> >> contains bios that were scheduled *before* the current one started, so
>> >
>> > These bios need to be offloaded too, otherwise you re-introduce this
>> > deadlock: https://www.redhat.com/archives/dm-devel/2014-May/msg00089.html
>>
>> Thanks for the link.
>> In the example the bio that is stuck was created in step 4. The call
>> to generic_make_request() will have placed it on current->bio_list[0].
>> The top-level generic_make_request call by Process A is still running,
>> so nothing will have moved the bio to ->bio_list[1]. That only happens
>> after the ->make_request_fn completes, which must be after step 7.
>> So the problem bio is on ->bio_list[0] and the code in my patch will
>> pass it to a workqueue for handling.
>>
>> So I don't think the deadlock would be reintroduced. Can you see
>> something that I am missing?
>>
>> Thanks,
>> NeilBrown
>
> Offloading only current->bio_list[0] will work in a simple case described
> above, but not in the general case.
>
> For example, suppose that we have a dm device where the first part is
> linear and the second part is snapshot.
>
> * We receive bio A that crosses the linear/snapshot boundary
> * DM core creates bio B, submits it to the linear target and adds it to
> current->bio_list[0]
> * DM core creates bio C, submits it to the snapshot target, the snapshot
> target calls track_chunk for this bio and appends the bio to
> current->bio_list[0]
> * Now, we return back to generic_make_request
> * We pop bio B from current->bio_list[0]
> * We execute bio_list_on_stack[1] = bio_list_on_stack[0], that moves bio C
> to bio_list_on_stack[1] - and now we lose any possibility to offload bio C
> to the rescue thread
> * The kcopyd thread for the snapshot takes the snapshot lock and waits for
> bio C to finish
Thanks for the explanation.
I cannot find this last step in the code. "waits for BIO C to finish"
is presumably the called to __check_for_conflicting_io(). This is
called from kcopyd in merge_callback() -> snapshot_merge_next_chunks().
What lock is held at that time?
> * We process bio B - and if processing bio B reaches something that takes
> the snapshot lock (for example an origin target for the snapshot), a
> deadlock will happen. The deadlock could be avoided by offloading bio C to
> the rescue thread, but bio C is already on bio_list_on_stack[1] and so it
> won't be offloaded
I think the core issue behind this deadlock is that the same volume
can appear twice in the top-level device, in different regions. An
outstanding request to one region can then interfere with a new request
to a different region. That seems like a reasonable thing to do.
I cannot immediately see a generic way to handle this case that I am
happy with. I'll keep hunting.
Thanks,
NeilBrown
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 832 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/dm-devel/attachments/20170904/05f212a0/attachment.sig>
More information about the dm-devel
mailing list