[dm-devel] can we reduce bio_set_dev overhead due to bio_associate_blkg?

Mike Snitzer snitzer at kernel.org
Wed Mar 30 16:52:58 UTC 2022


Hey Tejun and Dennis,

I recently found that, because bio_set_dev() calls
bio_associate_blkg(), bio_set_dev() needs much more CPU than is ideal,
especially when doing 4K IOs via io_uring's HIPRI bio-polling.

I'm very naive about blk-cgroups.. so I'm hopeful you or others can
help me cut through this to understand what the ideal outcome should
be for DM's bio clone + remap heavy use-case as it relates to
bio_associate_blkg.

If I hack dm-linear with a local __bio_set_dev that simply removes
the call to bio_associate_blkg() my IOPS go from ~980K to 995K.
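
For scale, that gain works out to roughly 1.5%. A quick sketch of the arithmetic, where pct_gain is a hypothetical helper name and the inputs are just the rounded ~980K and 995K figures quoted above:

```shell
#!/bin/bash
# Hypothetical helper: percent change from a baseline IOPS figure to a new one.
pct_gain() {
        awk -v before="$1" -v after="$2" \
            'BEGIN { printf "%.2f\n", (after - before) / before * 100 }'
}

# Rounded figures from the dm-linear experiment above.
pct_gain 980000 995000
```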

Looking a bit at what is happening in this DM bio cloning use case,
it seems __bio_clone() calls bio_clone_blkg_association() to clone
the blkg from the DM device; then dm-linear.c:linear_map's call to
bio_set_dev() causes bio_associate_blkg(bio) to reuse the css, but
that triggers an update because the bdev is being remapped in the bio
(linear_map sends the IO to the real underlying device). The end
result _seems_ like collectively wasteful effort to get the blk-cgroup
resources set up properly in the face of a simple remap.

Seems the current DM pattern is causing repeat blkg work for _every_
remapped bio?  Do you see a way to speed up repeat calls to
bio_associate_blkg()?

Test kernel is my latest dm-5.19 branch (though latest Linus 5.18-rc0
kernel should be fine too):
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-5.19

I'm using dm-linear on top of a 16G blk-mq null_blk device:

modprobe null_blk queue_mode=2 poll_queues=2 bs=4096 gb=16
SIZE=`blockdev --getsz /dev/nullb0`
echo "0 $SIZE linear /dev/nullb0 0" | dmsetup create linear
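
As a sanity check on the table line: blockdev --getsz reports the size in 512-byte sectors, so the 16G device should yield SIZE=33554432, assuming gb=16 means 16 GiB. A small sketch of that calculation, where linear_table is a hypothetical helper and not part of dmsetup:

```shell
#!/bin/bash
# Hypothetical helper: build the dm-linear table line for a device of the
# given size in GiB, assuming 512-byte sectors (the unit of --getsz).
linear_table() {
        local gib=$1 dev=$2
        local sectors=$(( gib * 1024 * 1024 * 1024 / 512 ))
        echo "0 $sectors linear $dev 0"
}

# Table line expected for the 16G null_blk device used here.
linear_table 16 /dev/nullb0
```

If the value printed by blockdev --getsz differs from this, the module parameters probably did not take effect as intended.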

And running the workload with fio using this wrapper script:
io_uring.sh 20 1 /dev/mapper/linear 4096

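#!/bin/bash
# Usage: io_uring.sh <runtime_secs> <numjobs> <device> <block_size>

RTIME=$1
JOBS=$2
DEV=$3
BS=$4

QD=64           # IO queue depth
BATCH=16        # batch size for submission and completion
HI=1            # 1 = polled (HIPRI) IO

fio --bs="$BS" --ioengine=io_uring --fixedbufs --registerfiles --hipri="$HI" \
        --iodepth="$QD" \
        --iodepth_batch_submit="$BATCH" \
        --iodepth_batch_complete_min="$BATCH" \
        --filename="$DEV" \
        --direct=1 --runtime="$RTIME" --numjobs="$JOBS" --rw=randread \
        --name=test --group_reporting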


