[dm-devel] [PATCH 0/4] Fix order when split bio and send remaining back to itself

dannyshih dannyshih at synology.com
Tue Dec 29 09:18:38 UTC 2020


From: Danny Shih <dannyshih at synology.com>

We found that split bios may be handled out of order when a big bio is
split by blk_queue_split() and then split again in a stacking block device,
such as an md device, because of the chunk size boundary limit.

Stacking block devices normally use submit_bio_noacct() to add the remaining
bio to the tail of current->bio_list after they split the original bio.
Therefore, when the bio is split the first time, the last part of the bio is
added to bio_list. Then, when the bio is split a second time, the middle part
of the bio is added to bio_list. As a result, the middle part ends up behind
the last part of the bio.

For example:
	There is a RAID0 md device with max_sectors_kb = 2 KB
	and chunk_size = 1 KB.

	1. A read bio arrives at the md device and wants to read 0-7 KB.
	2. In blk_queue_split(), the bio is split into (0-1) and (2-7),
	   and (2-7) is sent back to the md device.

	   current->bio_list = bio_list_on_stack[0]: (md 2-7)
	3. RAID0 splits bio (0-1) into (0) and (1), since the chunk size is
	   1 KB, and sends (1) back to the md device.

	   bio_list_on_stack[0]: (md 2-7) -> (md 1)
	4. (0) is remapped and sent to the lower layer device.

	   bio_list_on_stack[0]: (md 2-7) -> (md 1) -> (lower 0)
	5. __submit_bio_noacct() sorts the bios so that the lower bio is
	   handled first:
	   bio_list_on_stack[0]: (lower 0) -> (md 2-7) -> (md 1)
	   pop (lower 0)
	   move bio_list_on_stack[0] to bio_list_on_stack[1]

	   bio_list_on_stack[1]: (md 2-7) -> (md 1)
	6. After handling the lower bio, it handles (md 2-7) first, which is
	   split in blk_queue_split() into (2-3) and (4-7); (4-7) is sent back.

	   bio_list_on_stack[0]: (md 4-7)
	   bio_list_on_stack[1]: (md 1)
	7. RAID0 splits bio (2-3) into (2) and (3) and sends (3) back.

	   bio_list_on_stack[0]: (md 4-7) -> (md 3)
	   bio_list_on_stack[1]: (md 1)
	...
	In the end, the split bios are handled in this order:
	0 -> 2 -> 4 -> 6 -> 7 -> 5 -> 3 -> 1

Reversing the order of same-queue bios when sorting in
__submit_bio_noacct() would solve this issue, but it might have too broad
an impact. So we provide an alternative version of submit_bio_noacct(),
named submit_bio_noacct_add_head(), for the cases that need to add a bio
to the head of current->bio_list, and replace submit_bio_noacct() with
submit_bio_noacct_add_head() in the block device layer wherever a bio is
split and the remainder is sent back to the same device.
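
The ordering in the example above, and the effect of adding the remainder
to the head instead, can be reproduced with a small user-space model. This
is only a sketch, not kernel code. It assumes that
submit_bio_noacct_add_head() simply pushes the bio onto the head of
bio_list_on_stack[0], that all offsets are chunk-aligned, and that the
lower device issues each bio immediately. The simulate() function, its
tuple representation of a bio and its constants are hypothetical names
used for illustration.

```python
from collections import deque

MAX_SECTORS_KB = 2   # queue limit applied by blk_queue_split() in the example
CHUNK_KB = 1         # RAID0 chunk size in the example


def simulate(start, end, add_head):
    """Return the order in which 1 KB pieces reach the lower device.

    A bio is modelled as a (device, first_kb, last_kb) tuple.
    add_head=False models submit_bio_noacct() (remainder goes to the tail);
    add_head=True models the proposed helper (assumed to push to the head).
    """
    issued = []
    pending = deque([("md", start, end)])

    while pending:
        dev, lo, hi = pending.popleft()
        older = pending   # bio_list_on_stack[1]: bios queued before this one
        new = deque()     # bio_list_on_stack[0]: bios queued while handling it

        def resubmit(bio):
            # The remainder of a split is sent back to the md device.
            if add_head:
                new.appendleft(bio)
            else:
                new.append(bio)

        if dev == "lower":
            issued.append(lo)                     # lower device does the I/O
        else:
            if hi - lo + 1 > MAX_SECTORS_KB:      # blk_queue_split()
                resubmit(("md", lo + MAX_SECTORS_KB, hi))
                hi = lo + MAX_SECTORS_KB - 1
            if hi - lo + 1 > CHUNK_KB:            # RAID0 chunk-boundary split
                resubmit(("md", lo + CHUNK_KB, hi))
                hi = lo + CHUNK_KB - 1
            new.append(("lower", lo, hi))         # remapped bio, added to the tail

        # Sort the way __submit_bio_noacct() does: lower-device bios first,
        # then bios for the same device, then everything queued earlier.
        lower = [b for b in new if b[0] != dev]
        same = [b for b in new if b[0] == dev]
        pending = deque(lower + same + list(older))

    return issued


print(simulate(0, 7, add_head=False))  # [0, 2, 4, 6, 7, 5, 3, 1]
print(simulate(0, 7, add_head=True))   # [0, 1, 2, 3, 4, 5, 6, 7]
```

With tail insertion, each remainder lands behind the larger remainder
queued just before it, so the small pieces are only drained after the
whole bio has been walked. With head insertion, the most recently split
remainder is handled next and the pieces come out in ascending order.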

Danny Shih (4):
  block: introduce submit_bio_noacct_add_head
  block: use submit_bio_noacct_add_head for split bio sending back
  dm: use submit_bio_noacct_add_head for split bio sending back
  md: use submit_bio_noacct_add_head for split bio sending back

 block/blk-core.c       | 44 +++++++++++++++++++++++++++++++++-----------
 block/blk-merge.c      |  2 +-
 block/bounce.c         |  2 +-
 drivers/md/dm.c        |  2 +-
 drivers/md/md-linear.c |  2 +-
 drivers/md/raid0.c     |  4 ++--
 drivers/md/raid1.c     |  4 ++--
 drivers/md/raid10.c    |  4 ++--
 drivers/md/raid5.c     |  2 +-
 include/linux/blkdev.h |  1 +
 10 files changed, 45 insertions(+), 22 deletions(-)

-- 
2.7.4



