[dm-devel] [PATCH 5/8] dm: always setup ->orig_bio in alloc_io

Mike Snitzer snitzer at redhat.com
Wed Apr 13 06:12:47 UTC 2022


On Tue, Apr 12 2022 at  9:56P -0400,
Ming Lei <ming.lei at redhat.com> wrote:

> On Tue, Apr 12, 2022 at 04:52:40PM -0400, Mike Snitzer wrote:
> > On Tue, Apr 12 2022 at  4:56P -0400,
> > Ming Lei <ming.lei at redhat.com> wrote:
> > 
> > > The current DM codes setup ->orig_bio after __map_bio() returns,
> > > and not only cause kernel panic for dm zone, but also a bit ugly
> > > and tricky, especially the waiting until ->orig_bio is set in
> > > dm_submit_bio_remap().
> > > 
> > > The reason is that one new bio is cloned from original FS bio to
> > > represent the mapped part, which just serves io accounting.
> > > 
> > > Now we have switched to bdev based io accounting interface, and we
> > > can retrieve sectors/bio_op from both the real original bio and the
> > > added fields of .sector_offset & .sectors easily, so the new cloned
> > > bio isn't necessary any more.
> > > 
> > > Not only fixes dm-zone's kernel panic, but also cleans up dm io
> > > accounting & split a bit.
> > 
> > You're conflating quite a few things here.  DM zone really has no
> > business accessing io->orig_bio (dm-zone.c can just as easily inspect
> > the tio->clone, because it hasn't been remapped yet it reflects the
> > io->origin_bio, so there is no need to look at io->orig_bio) -- but
> > yes I clearly broke things during the 5.18 merge and it needs fixing
> > ASAP.
> 
> You can just consider the cleanup part of this patches, :-)

I will.  But your following list doesn't reflect any "cleanup" that I
saw in your patchset.  Pretty fundamental changes that are similar,
but different, to the dm-5.19 changes I've staged.

> 1) no late assignment of ->orig_bio, and always set it in alloc_io()
>
> 2) no waiting on on ->origi_bio, especially the waiting is done in
> fast path of dm_submit_bio_remap().

For 5.18 waiting on io->orig_bio just enables a signal that the IO was
split and can be accounted.

For 5.19 I also plan on using late io->orig_bio assignment as an
alternative to the full-blown refcounting currently done with
io->io_count.  I've yet to quantify the gains with focused testing but
in theory this approach should scale better on large systems with many
concurrent IO threads to the same device (RCU is primary constraint
now).

I'll try to write a bpfrace script to measure how frequently "waiting on
io->orig_bio" occurs for dm_submit_bio_remap() heavy usage (like
dm-crypt). But I think we'll find it is very rarely, if ever, waited
on in the fast path.

> 3) no split for io accounting

DM's more recent approach to splitting has never been done for benefit
or use of IO accounting, see this commit for its origin:
18a25da84354c6b ("dm: ensure bio submission follows a depth-first tree walk")

Not sure why you keep poking fun at DM only doing a single split when:
that is the actual design.  DM splits off orig_bio then recurses to
handle the remainder of the bio that wasn't issued.  Storing it in
io->orig_bio (previously io->bio) was always a means of reflecting
things properly. And yes IO accounting is one use, the other is IO
completion. But unfortunately DM's IO accounting has always been a
mess ever since the above commit. Changes in 5.18 fixed that.

But again, DM's splitting has _nothing_ to do with IO accounting.
Splitting only happens when needed for IO submission given constraints
of DM target(s) or underlying layers.

All said, I will look closer at your entire set and see if it better
to go with your approach.  This patch in particular is interesting
(avoids cloning and other complexity of bio_split + bio_chain):
https://patchwork.kernel.org/project/dm-devel/patch/20220412085616.1409626-6-ming.lei@redhat.com/



More information about the dm-devel mailing list