[dm-devel] [PATCH 1/3] dm: don't grab target io reference in dm_zone_map_bio
Ming Lei
ming.lei at redhat.com
Tue Apr 12 00:09:08 UTC 2022
On Tue, Apr 12, 2022 at 08:33:04AM +0900, Damien Le Moal wrote:
> On 4/11/22 23:18, Ming Lei wrote:
> >>>> This fixes the issue:
> >>>>
> >>>> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> >>>> index 3c5fad7c4ee6..3dd6735450c5 100644
> >>>> --- a/drivers/md/dm.c
> >>>> +++ b/drivers/md/dm.c
> >>>> @@ -581,7 +581,7 @@ static struct dm_io *alloc_io(struct mapped_device
> >>>> *md, struct bio *bio)
> >>>> io->status = 0;
> >>>> atomic_set(&io->io_count, 1);
> >>>> this_cpu_inc(*md->pending_io);
> >>>> - io->orig_bio = NULL;
> >>>> + io->orig_bio = bio;
> >>>> io->md = md;
> >>>> io->map_task = current;
> >>>> spin_lock_init(&io->lock);
> >>>>
> >>>> Otherwise, the dm-zone.c code sees a NULL orig_bio.
> >>>> However, this change may be messing up the bio accounting. Need to check that.
> >>>
> >>> Looks it is one recent regression since:
> >>>
> >>> commit 0fbb4d93b38b ("dm: add dm_submit_bio_remap interface")
> >>
> >> Yep, saw that. Problem is, I really do not understand that change setting
> >> io->orig_bio *after* __map_bio() is called. It seems that the accounting
> >> is done on each fragment of the orig_bio instead of once for the entire
> >> BIO... So my "fix" above seems wrong. Apart from passing along orig_bio as
> >> an argument to __map_bio() from __split_and_process_bio(), I do not think
> >> my change is correct. Thoughts ?
> >
> > Frankly speaking, both changing ->orig_bio after split and setting ->orig_bio
> > after ->map() looks ugly & tricky, and the following change should avoid the
> > issue, meantime simplify dm accounting a bit:
> >
> > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > index 3c5fad7c4ee6..f1fe83113608 100644
> > --- a/drivers/md/dm.c
> > +++ b/drivers/md/dm.c
> > @@ -581,7 +581,7 @@ static struct dm_io *alloc_io(struct mapped_device *md, struct bio *bio)
> > io->status = 0;
> > atomic_set(&io->io_count, 1);
> > this_cpu_inc(*md->pending_io);
> > - io->orig_bio = NULL;
> > + io->orig_bio = bio;
> > io->md = md;
> > io->map_task = current;
> > spin_lock_init(&io->lock);
> > @@ -1223,19 +1223,11 @@ void dm_submit_bio_remap(struct bio *clone, struct bio *tgt_clone)
> > * Account io->origin_bio to DM dev on behalf of target
> > * that took ownership of IO with DM_MAPIO_SUBMITTED.
> > */
> > - if (io->map_task == current) {
> > + if (io->map_task == current)
> > /* Still in target's map function */
> > dm_io_set_flag(io, DM_IO_START_ACCT);
> > - } else {
> > - /*
> > - * Called by another thread, managed by DM target,
> > - * wait for dm_split_and_process_bio() to store
> > - * io->orig_bio
> > - */
> > - while (unlikely(!smp_load_acquire(&io->orig_bio)))
> > - msleep(1);
> > + else
>
> Curly brackets around the else here.
>
> > dm_start_io_acct(io, clone);
> > - }
> >
> > __dm_submit_bio_remap(tgt_clone, disk_devt(io->md->disk),
> > tio->old_sector);
> > @@ -1562,7 +1554,7 @@ static void dm_split_and_process_bio(struct mapped_device *md,
> > struct dm_table *map, struct bio *bio)
> > {
> > struct clone_info ci;
> > - struct bio *orig_bio = NULL;
> > + struct bio *new_bio = NULL;
> > int error = 0;
> >
> > init_clone_info(&ci, md, map, bio);
> > @@ -1578,22 +1570,14 @@ static void dm_split_and_process_bio(struct mapped_device *md,
> > if (error || !ci.sector_count)
> > goto out;
> >
> > - /*
> > - * Remainder must be passed to submit_bio_noacct() so it gets handled
> > - * *after* bios already submitted have been completely processed.
> > - * We take a clone of the original to store in ci.io->orig_bio to be
> > - * used by dm_end_io_acct() and for dm_io_complete() to use for
> > - * completion handling.
> > - */
>
> This comment should remain with some adjustment.
Fine, just felt the approach is very straightforward.
>
> > - orig_bio = bio_split(bio, bio_sectors(bio) - ci.sector_count,
> > - GFP_NOIO, &md->queue->bio_split);
> > - bio_chain(orig_bio, bio);
> > - trace_block_split(orig_bio, bio->bi_iter.bi_sector);
> > - submit_bio_noacct(bio);
> > + new_bio = bio_alloc_clone(bio->bi_bdev, bio, GFP_NOIO,
> > + &md->queue->bio_split);
>
> Why not keep using bio_split() ?
With bio_split(), 'bio' actually tracks the remainder, and the returned
'orig_bio' tracks the part for current target io, so ->orig_bio has to
be updated in this way.
With bio_alloc_clone() and bio_trim(), ->orig_bio needn't to be
changed, and assigning it in alloc_io() works perfectly.
>
> > + bio_trim(new_bio, bio_sectors(bio) - ci.sector_count, ci.sector_count);
> > + bio_trim(bio, 0, bio_sectors(bio) - ci.sector_count);
> > + bio_chain(new_bio, bio);
> > + trace_block_split(new_bio, new_bio->bi_iter.bi_sector);
> > + submit_bio_noacct(new_bio);
> > out:
> > - if (!orig_bio)
> > - orig_bio = bio;
> > - smp_store_release(&ci.io->orig_bio, orig_bio);
> > if (dm_io_flagged(ci.io, DM_IO_START_ACCT))
> > dm_start_io_acct(ci.io, NULL);
>
> I tested this and it works. Need to check the accounting though.
> And I agree this is a lot cleaner :)
BTW, the cloned bio for split is just for accounting purpose, if
->bi_iter.bi_sector & ->bi_iter.bi_size can be stored in 'struct dm_io',
the cloned bio can be avoided, but code may become not readable as
before.
Thanks,
Ming
More information about the dm-devel
mailing list