[dm-devel] [PATCH 1/3] dm: don't grab target io reference in dm_zone_map_bio

Ming Lei ming.lei at redhat.com
Tue Apr 12 00:09:08 UTC 2022


On Tue, Apr 12, 2022 at 08:33:04AM +0900, Damien Le Moal wrote:
> On 4/11/22 23:18, Ming Lei wrote:
> >>>> This fixes the issue:
> >>>>
> >>>> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> >>>> index 3c5fad7c4ee6..3dd6735450c5 100644
> >>>> --- a/drivers/md/dm.c
> >>>> +++ b/drivers/md/dm.c
> >>>> @@ -581,7 +581,7 @@ static struct dm_io *alloc_io(struct mapped_device
> >>>> *md, struct bio *bio)
> >>>>         io->status = 0;
> >>>>         atomic_set(&io->io_count, 1);
> >>>>         this_cpu_inc(*md->pending_io);
> >>>> -       io->orig_bio = NULL;
> >>>> +       io->orig_bio = bio;
> >>>>         io->md = md;
> >>>>         io->map_task = current;
> >>>>         spin_lock_init(&io->lock);
> >>>>
> >>>> Otherwise, the dm-zone.c code sees a NULL orig_bio.
> >>>> However, this change may be messing up the bio accounting. Need to check that.
> >>>
> >>> Looks it is one recent regression since:
> >>>
> >>> commit 0fbb4d93b38b ("dm: add dm_submit_bio_remap interface")
> >>
> >> Yep, saw that. Problem is, I really do not understand that change setting
> >> io->orig_bio *after* __map_bio() is called. It seems that the accounting
> >> is done on each fragment of the orig_bio instead of once for the entire
> >> BIO... So my "fix" above seems wrong. Apart from passing along orig_bio as
> >> an argument to  __map_bio() from __split_and_process_bio(), I do not think
> >> my change is correct. Thoughts ?
> > 
> > Frankly speaking, both changing ->orig_bio after split and setting ->orig_bio
> > after ->map() looks ugly & tricky, and the following change should avoid the
> > issue, meantime simplify dm accounting a bit:
> > 
> > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > index 3c5fad7c4ee6..f1fe83113608 100644
> > --- a/drivers/md/dm.c
> > +++ b/drivers/md/dm.c
> > @@ -581,7 +581,7 @@ static struct dm_io *alloc_io(struct mapped_device *md, struct bio *bio)
> >  	io->status = 0;
> >  	atomic_set(&io->io_count, 1);
> >  	this_cpu_inc(*md->pending_io);
> > -	io->orig_bio = NULL;
> > +	io->orig_bio = bio;
> >  	io->md = md;
> >  	io->map_task = current;
> >  	spin_lock_init(&io->lock);
> > @@ -1223,19 +1223,11 @@ void dm_submit_bio_remap(struct bio *clone, struct bio *tgt_clone)
> >  	 * Account io->origin_bio to DM dev on behalf of target
> >  	 * that took ownership of IO with DM_MAPIO_SUBMITTED.
> >  	 */
> > -	if (io->map_task == current) {
> > +	if (io->map_task == current)
> >  		/* Still in target's map function */
> >  		dm_io_set_flag(io, DM_IO_START_ACCT);
> > -	} else {
> > -		/*
> > -		 * Called by another thread, managed by DM target,
> > -		 * wait for dm_split_and_process_bio() to store
> > -		 * io->orig_bio
> > -		 */
> > -		while (unlikely(!smp_load_acquire(&io->orig_bio)))
> > -			msleep(1);
> > +	else
> 
> Curly brackets around the else here.
> 
> >  		dm_start_io_acct(io, clone);
> > -	}
> >  
> >  	__dm_submit_bio_remap(tgt_clone, disk_devt(io->md->disk),
> >  			      tio->old_sector);
> > @@ -1562,7 +1554,7 @@ static void dm_split_and_process_bio(struct mapped_device *md,
> >  				     struct dm_table *map, struct bio *bio)
> >  {
> >  	struct clone_info ci;
> > -	struct bio *orig_bio = NULL;
> > +	struct bio *new_bio = NULL;
> >  	int error = 0;
> >  
> >  	init_clone_info(&ci, md, map, bio);
> > @@ -1578,22 +1570,14 @@ static void dm_split_and_process_bio(struct mapped_device *md,
> >  	if (error || !ci.sector_count)
> >  		goto out;
> >  
> > -	/*
> > -	 * Remainder must be passed to submit_bio_noacct() so it gets handled
> > -	 * *after* bios already submitted have been completely processed.
> > -	 * We take a clone of the original to store in ci.io->orig_bio to be
> > -	 * used by dm_end_io_acct() and for dm_io_complete() to use for
> > -	 * completion handling.
> > -	 */
> 
> This comment should remain with some adjustment.

Fine, just felt the approach is very straightforward.

> 
> > -	orig_bio = bio_split(bio, bio_sectors(bio) - ci.sector_count,
> > -			     GFP_NOIO, &md->queue->bio_split);
> > -	bio_chain(orig_bio, bio);
> > -	trace_block_split(orig_bio, bio->bi_iter.bi_sector);
> > -	submit_bio_noacct(bio);
> > +	new_bio = bio_alloc_clone(bio->bi_bdev, bio, GFP_NOIO,
> > +			&md->queue->bio_split);
> 
> Why not keep using bio_split() ?

With bio_split(), 'bio' actually tracks the remainder, and the returned
'orig_bio' tracks the part for current target io, so ->orig_bio has to
be updated in this way.

With bio_alloc_clone() and bio_trim(), ->orig_bio needn't to be
changed, and assigning it in alloc_io() works perfectly.

> 
> > +	bio_trim(new_bio, bio_sectors(bio) - ci.sector_count, ci.sector_count);
> > +	bio_trim(bio, 0, bio_sectors(bio) - ci.sector_count);
> > +	bio_chain(new_bio, bio);
> > +	trace_block_split(new_bio, new_bio->bi_iter.bi_sector);
> > +	submit_bio_noacct(new_bio);
> >  out:
> > -	if (!orig_bio)
> > -		orig_bio = bio;
> > -	smp_store_release(&ci.io->orig_bio, orig_bio);
> >  	if (dm_io_flagged(ci.io, DM_IO_START_ACCT))
> >  		dm_start_io_acct(ci.io, NULL);
> 
> I tested this and it works. Need to check the accounting though.
> And I agree this is a lot cleaner :)

BTW, the cloned bio for split is just for accounting purpose, if
->bi_iter.bi_sector & ->bi_iter.bi_size can be stored in 'struct dm_io',
the cloned bio can be avoided, but code may become not readable as
before.


Thanks,
Ming


More information about the dm-devel mailing list