[dm-devel] [PATCH v2 5/9] dm: improve noclone bio support
Mike Snitzer
snitzer at redhat.com
Fri Feb 22 16:56:54 UTC 2019
On Fri, Feb 22 2019 at 10:22am -0500,
Mike Snitzer <snitzer at redhat.com> wrote:
> On Fri, Feb 22 2019 at 5:59am -0500,
> Mikulas Patocka <mpatocka at redhat.com> wrote:
>
> > Hi
> >
> > I've fonud out that this patch causes timeout in the lvm test
> > shell/activate-missing-segment.sh and some others.
>
> OK, I can reproduce with:
> make check_local T=shell/activate-missing-segment.sh
>
> Interestingly, it isn't hung in kernel, test can be interrupted.
> And it is hanging at shell/activate-missing-segment.sh's : aux disable_dev "$dev1"
>
> Other tests with "aux disable_dev" hang too
> (e.g. shell/lvcreate-repair.sh, shell/vgreduce-usage.sh)
>
> continued below:
>
> > On Wed, 20 Feb 2019, Mike Snitzer wrote:
> >
> > > Leverage fact that dm_queue_split() enables use of noclone support even
> > > for targets that require additional splitting (via ti->max_io_len).
> > >
> > > To achieve this move noclone processing from dm_make_request() to
> > > dm_process_bio() and check if bio->bi_end_io is already set to
> > > noclone_endio. This also fixes dm_wq_work() to be able to support
> > > noclone bios (because it calls dm_process_bio()).
> > >
> > > While at it, clean up ->map() return processing to be comparable to
> > > __map_bio() and add some code comments that explain why certain
> > > conditionals exist to prevent the use of noclone.
> > >
> > > Signed-off-by: Mike Snitzer <snitzer at redhat.com>
> > > ---
> > > drivers/md/dm.c | 103 ++++++++++++++++++++++++++++++++++----------------------
> > > 1 file changed, 62 insertions(+), 41 deletions(-)
> > >
> > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > index 57919f211acc..d84735be6927 100644
> > > --- a/drivers/md/dm.c
> > > +++ b/drivers/md/dm.c
> > > @@ -1764,14 +1764,75 @@ static blk_qc_t dm_process_bio(struct mapped_device *md,
> > > * If in ->make_request_fn we need to use blk_queue_split(), otherwise
> > > * queue_limits for abnormal requests (e.g. discard, writesame, etc)
> > > * won't be imposed.
> > > + *
> > > + * For normal READ and WRITE requests we benefit from upfront splitting
> > > + * via dm_queue_split() because we can then leverage noclone support below.
> > > */
> > > - if (current->bio_list) {
> > > + if (current->bio_list && (bio->bi_end_io != noclone_endio)) {
> > > + /*
> > > + * It is fine to use {blk,dm}_queue_split() before opting to use noclone
> > > + * because any ->bi_private and ->bi_end_io will get saved off below.
> > > + * Once noclone_endio() is called the bio_chain's endio will kick in.
> > > + * - __split_and_process_bio() can split further, but any targets that
> > > + * require late _DM_ initiated splitting (e.g. dm_accept_partial_bio()
> > > + * callers) shouldn't set ti->no_clone.
> > > + */
> > > if (is_abnormal_io(bio))
> > > blk_queue_split(md->queue, &bio);
> > > else
> > > dm_queue_split(md, ti, &bio);
> > > }
> > >
> > > + if (dm_table_supports_noclone(map) &&
> > > + (bio_op(bio) == REQ_OP_READ || bio_op(bio) == REQ_OP_WRITE) &&
> > > + likely(!(bio->bi_opf & REQ_PREFLUSH)) &&
> > > + likely(!bio_integrity(bio)) && /* integrity requires specialized processing */
> > > + likely(!dm_stats_used(&md->stats))) { /* noclone doesn't support dm-stats */
> > > + int r;
> > > + struct dm_noclone *noclone;
> > > +
> > > + /* Already been here if bio->bi_end_io is noclone_endio, reentered via dm_wq_work() */
> > > + if (bio->bi_end_io != noclone_endio) {
>
> This conditional is causing it ^
>
> Very strange...
Added some debugging:
Feb 22 10:46:28 thegoat kernel: device-mapper: core: dm_process_bio: 253:2: bio=00000000c70f3c09 bio->bi_iter.bi_sector=0 len=256 bio->bi_end_io != noclone_endio
Feb 22 10:46:28 thegoat kernel: device-mapper: core: dm_process_bio: 253:0: bio=00000000c70f3c09 bio->bi_iter.bi_sector=2048 len=256 bio->bi_end_io = noclone_endio
...
Feb 22 10:46:28 thegoat kernel: device-mapper: core: dm_process_bio: 253:2: bio=00000000da9cadf3 bio->bi_iter.bi_sector=0 len=256 bio->bi_end_io != noclone_endio
Feb 22 10:46:28 thegoat kernel: device-mapper: core: dm_process_bio: 253:0: bio=00000000da9cadf3 bio->bi_iter.bi_sector=2048 len=256 bio->bi_end_io = noclone_endio
So the same noclone bio is entering devices in the stacked DM device
(perfectly normal).
This fixes the issue:
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index a9856f1986df..8fbe4d5ec8e4 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1831,15 +1831,16 @@ static blk_qc_t dm_process_bio(struct mapped_device *md,
likely(!bio_integrity(bio)) && /* integrity requires specialized processing */
likely(!dm_stats_used(&md->stats))) { /* noclone doesn't support dm-stats */
int r;
- struct dm_noclone *noclone;
-
- /* Already been here if bio->bi_end_io is noclone_endio, reentered via dm_wq_work() */
- if (bio->bi_end_io != noclone_endio) {
+ /*
+ * Only allocate noclone if in ->make_request_fn, otherwise
+ * leak could occur due to reentering (e.g. from dm_wq_work)
+ */
+ if (current->bio_list) {
+ struct dm_noclone *noclone;
noclone = kmalloc_node(sizeof(*noclone) + ti->per_io_data_size + sizeof(unsigned),
GFP_NOWAIT, md->numa_node_id);
if (unlikely(!noclone))
goto no_fast_path;
-
noclone->md = md;
noclone->start_time = jiffies;
noclone->orig_bi_end_io = bio->bi_end_io;
That said, I've reasoned through other edge cases that need to be dealt
with and will layer other fixes in addition to this one.
More information about the dm-devel
mailing list