[dm-devel] [PATCH 4/4] dm: implement no-clone optimization
Mikulas Patocka
mpatocka at redhat.com
Fri Feb 15 14:09:16 UTC 2019
On Thu, 14 Feb 2019, Mike Snitzer wrote:
> On Thu, Feb 14 2019 at 11:54am -0500,
> Mikulas Patocka <mpatocka at redhat.com> wrote:
>
> > > > x86-64, 2x six-core
> > > > /dev/ram0 2449MiB/s
> > > > /dev/mapper/lin 5.0-rc without optimization 1970MiB/s
> > > > /dev/mapper/lin 5.0-rc with optimization 2238MiB/s
> > > >
> > > > arm64, quad core:
> > > > /dev/ram0 457MiB/s
> > > > /dev/mapper/lin 5.0-rc without optimization 325MiB/s
> > > > /dev/mapper/lin 5.0-rc with optimization 364MiB/s
> > > >
> > > > Signed-off-by: Mikulas Patocka <mpatocka at redhat.com>
> > >
> > > Nice performance improvement. But each device should have its own
> > > mempool for dm_noclone + front padding. So it should be wired into
> > > dm_alloc_md_mempools().
> >
> > We don't need to use mempools - if the slab allocation fails, we fall back
> > to the cloning path that has mempools.
>
> But the implementation benefits from each DM device having control over
> any extra memory it'd like to use for front padding. Same as is done
> now for the full-blown DM core with cloning.
If the machine is out of memory, you already have much more serious
problems to deal with - attempting to optimize I/O by 13% doesn't make
sense.
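
To illustrate the fallback argued for above, here is a minimal userspace
sketch, not the patch's code. All names in it (noclone_ctx, try_alloc,
submit, simulate_oom) are hypothetical, and malloc() stands in for the slab
allocation. The fast path makes one opportunistic allocation with no reserve
behind it; if that fails, the request is routed to the cloning path, which in
DM core is backed by mempools.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct dm_noclone; the field is illustrative. */
struct noclone_ctx {
	void *orig_private;
};

enum submit_path { PATH_NOCLONE, PATH_CLONE };

/* Test hook: when true, the opportunistic allocation is forced to fail. */
static bool simulate_oom;

/* Opportunistic allocation: may fail, no reserve behind it. */
static struct noclone_ctx *try_alloc(void)
{
	if (simulate_oom)
		return NULL;
	return malloc(sizeof(struct noclone_ctx));
}

/* Completion side of the fast path: release the context. */
static void noclone_done(struct noclone_ctx *ctx)
{
	assert(ctx != NULL);
	free(ctx);
}

/*
 * Try the fast path first. If the allocation fails, tell the caller to
 * use the cloning path instead of waiting for memory.
 */
static enum submit_path submit(void *private)
{
	struct noclone_ctx *ctx = try_alloc();

	if (!ctx)
		return PATH_CLONE;

	ctx->orig_private = private;
	/* A real fast path would remap and submit the bio here. */
	noclone_done(ctx);
	return PATH_NOCLONE;
}
```

With simulate_oom left false, submit() reports PATH_NOCLONE; setting it to
true makes the same call report PATH_CLONE, so a failed allocation costs
only the optimization, never forward progress.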
> > > It is fine if you don't actually deal with supporting per-bio-data in
> > > this patch, but a follow-on patch to add support for noclone-based
> > > per-bio-data shouldn't be expected to refactor the location of the
> > > mempool allocation (module vs per-device granularity).
> > >
> > > Mike
> >
> > I tried to use per-bio-data and other features - and it makes the
> > structure dm_noclone and function noclone_endio grow:
> >
> > #define DM_NOCLONE_MAGIC 9693664
> > struct dm_noclone {
> > 	struct mapped_device *md;
> > 	struct dm_target *ti;
> > 	struct bio *bio;
> > 	struct bvec_iter orig_bi_iter;
> > 	bio_end_io_t *orig_bi_end_io;
> > 	void *orig_bi_private;
> > 	unsigned long start_time;
> > 	/* ... per-bio data ... */
> > 	/* DM_NOCLONE_MAGIC */
> > };
> >
> > And this growth degrades performance on linear target - from 2238MiB/s to
> > 2145MiB/s.
>
> It shouldn't if done properly.. for linear there wouldn't be any growth.
That means variable structure length depending on target?
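
For concreteness, a variable-length layout could look roughly like the
userspace sketch below. This is an assumption about what "done properly"
might mean, not code from any patch: the names noclone_hdr, noclone_size,
noclone_alloc and noclone_per_bio_data are hypothetical, and the magic is
kept in the header for simplicity rather than after the per-bio data as in
the structure quoted above. A target that needs no per-bio data (such as
linear) would pay for the header only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NOCLONE_MAGIC 9693664u	/* same value as DM_NOCLONE_MAGIC above */

/* Hypothetical header; the quoted struct dm_noclone has more fields. */
struct noclone_hdr {
	void *orig_private;
	unsigned long start_time;
	size_t per_bio_data_size;
	unsigned int magic;
	/* Per-bio data follows; zero bytes for a target that needs none. */
	unsigned char per_bio_data[];
};

/* Allocation size depends on how much per-bio data the target asked for. */
static size_t noclone_size(size_t per_bio_data_size)
{
	return sizeof(struct noclone_hdr) + per_bio_data_size;
}

/* Allocate a zeroed header plus the target's per-bio data area. */
static struct noclone_hdr *noclone_alloc(size_t per_bio_data_size)
{
	struct noclone_hdr *h = calloc(1, noclone_size(per_bio_data_size));

	if (!h)
		return NULL;
	h->per_bio_data_size = per_bio_data_size;
	h->magic = NOCLONE_MAGIC;
	return h;
}

/* Return the per-bio data area, or NULL if the target requested none. */
static void *noclone_per_bio_data(struct noclone_hdr *h)
{
	assert(h->magic == NOCLONE_MAGIC);
	return h->per_bio_data_size ? h->per_bio_data : NULL;
}
```

The cost of this scheme is that the allocation size is no longer a
compile-time constant, so each distinct size needs its own cache or pool.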

Other targets are so slow that they don't need this optimization at all -
for example dm-thin has 80 - 110MiB/s for the same use case - an
optimization that improves performance of linear by 13% has no effect
here.

If we had a target that performs as well as linear or striped, this
optimization could be enabled for it.

Mikulas