[dm-devel] [PATCH 4/4] dm: support I/O polling
Mike Snitzer
snitzer at redhat.com
Fri Mar 5 18:09:38 UTC 2021
On Fri, Mar 05 2021 at 12:56pm -0500,
Heinz Mauelshagen <heinzm at redhat.com> wrote:
>
> On 3/5/21 6:46 PM, Heinz Mauelshagen wrote:
> >On 3/5/21 10:52 AM, JeffleXu wrote:
> >>
> >>On 3/3/21 6:09 PM, Mikulas Patocka wrote:
> >>>
> >>>On Wed, 3 Mar 2021, JeffleXu wrote:
> >>>
> >>>>
> >>>>On 3/3/21 3:05 AM, Mikulas Patocka wrote:
> >>>>
> >>>>>Support I/O polling if submit_bio_noacct_mq_direct returned non-empty
> >>>>>cookie.
> >>>>>
> >>>>>Signed-off-by: Mikulas Patocka <mpatocka at redhat.com>
> >>>>>
> >>>>>---
> >>>>> drivers/md/dm.c | 5 +++++
> >>>>> 1 file changed, 5 insertions(+)
> >>>>>
> >>>>>Index: linux-2.6/drivers/md/dm.c
> >>>>>===================================================================
> >>>>>--- linux-2.6.orig/drivers/md/dm.c 2021-03-02
> >>>>>19:26:34.000000000 +0100
> >>>>>+++ linux-2.6/drivers/md/dm.c 2021-03-02 19:26:34.000000000 +0100
> >>>>>@@ -1682,6 +1682,11 @@ static void __split_and_process_bio(stru
> >>>>> }
> >>>>> }
> >>>>> + if (ci.poll_cookie != BLK_QC_T_NONE) {
> >>>>>+ while (atomic_read(&ci.io->io_count) > 1 &&
> >>>>>+ blk_poll(ci.poll_queue, ci.poll_cookie, true)) ;
> >>>>>+ }
> >>>>>+
> >>>>> /* drop the extra reference count */
> >>>>> dec_pending(ci.io, errno_to_blk_status(error));
> >>>>> }
> >>>>It seems that the general idea of your design is to
> >>>>1) submit *one* split bio
> >>>>2) blk_poll(), waiting the previously submitted split bio complets
> >>>No, I submit all the bios and poll for the last one.
> >>>
> >>>>and then submit next split bio, repeating the above process.
> >>>>I'm afraid
> >>>>the performance may be an issue here, since the batch every time
> >>>>blk_poll() reaps may decrease.
> >>>Could you benchmark it?
> >>I only tested dm-linear.
> >>
> >>The configuration (dm table) of dm-linear is:
> >>0 1048576 linear /dev/nvme0n1 0
> >>1048576 1048576 linear /dev/nvme2n1 0
> >>2097152 1048576 linear /dev/nvme5n1 0
> >>
> >>
> >>fio script used is:
> >>```
> >>$cat fio.conf
> >>[global]
> >>name=iouring-sqpoll-iopoll-1
> >>ioengine=io_uring
> >>iodepth=128
> >>numjobs=1
> >>thread
> >>rw=randread
> >>direct=1
> >>registerfiles=1
> >>hipri=1
> >>runtime=10
> >>time_based
> >>group_reporting
> >>randrepeat=0
> >>filename=/dev/mapper/testdev
> >>bs=4k
> >>
> >>[job-1]
> >>cpus_allowed=14
> >>```
> >>
> >>IOPS (IRQ mode) | IOPS (iopoll mode (hipri=1))
> >>--------------- | --------------------
> >> 213k | 19k
> >>
> >>At least, it doesn't work well with io_uring interface.
> >>
> >>
> >
> >
> >Jeffle,
> >
> >I ran your above fio test on a linear LV split across 3 NVMes to
> >second your split mapping
> >(system: 32 core Intel, 256GiB RAM) comparing io engines sync,
> >libaio and io_uring,
> >the latter w/ and w/o hipri (sync+libaio obviously w/o
> >registerfiles and hipri) which resulted ok:
> >
> >
> >
> >sync | libaio | IRQ mode (hipri=0) | iopoll (hipri=1)
> >------|----------|---------------------|----------------- 56.3K
> >| 290K | 329K | 351K I can't second
> >your drastic hipri=1 drop here...
>
>
> Sorry, email mess.
>
>
> sync | libaio | IRQ mode (hipri=0) | iopoll (hipri=1)
> -------|----------|---------------------|-----------------
> 56.3K | 290K | 329K | 351K
>
>
>
> I can't second your drastic hipri=1 drop here...
I think your result is just showcasing your powerful system's ability to
poll every related HW queue.. whereas Jeffle's system is likely somehow
more constrained (on a cpu level, memory, whatever).
My basis for this is that Mikulas' changes simply always return an
invalid cookie (BLK_QC_T_NONE) for purposes of intelligent IO polling.
Such an implementation is completely invalid.
I discussed with Jens and he said:
"it needs to return something that f_op->iopoll() can make sense of.
otherwise you have no option but to try everything."
Mike
More information about the dm-devel
mailing list