[Vdo-devel] [lvm-team] Trying to test thin provisioned LVM on VDO

Wed Jul 11 13:38:21 UTC 2018

Hello,

Would it be a good idea to document this in a kcs and also raise a bz
preemptively?

Regards,
Nikhil.

On Wed 11 Jul, 2018, 7:05 PM Mike Snitzer, <snitzer at redhat.com> wrote:

> On Wed, Jul 11 2018 at  6:48am -0400,
> James Hogarth <james.hogarth at gmail.com> wrote:
>
> > On 11 July 2018 at 11:26, James Hogarth <james.hogarth at gmail.com> wrote:
> > > On 11 July 2018 at 10:40, Michael Sclafani <sclafani at redhat.com>
> wrote:
> > >> Based on the error message and a quick scan of the code, it appears
> dm-thin
> > >> disables discards because VDO's max_discard_sectors = 4KB is smaller
> than
> > >> dm-thin's 64KB+ block size. I have no idea why it does that, but if it
> > >> neither discards nor zeros out blocks it has written to VDO, that
> space will
> > >> not be reclaimed.
> > >>
> > >
> > > Thanks for confirming the line of thought I was following ...
> > >
> > > Annoyingly this makes the RHEL documentation pretty useless to follow
> > > for carrying out thin provisioned volumes...
> > >
> > > Unfortunately I don't have a support account to hand to raise this as
> > > a RHEL7.5 issue to resolve ...
> > >
> > > Looking at the lvcreate man page it's not possible to set a block size
> > > for a thin pool below 64K
> > >
> > > -c|--chunksize Size[k|UNIT]
> > >               The size of chunks in a snapshot, cache pool or thin
> > > pool.  For snapshots, the value
> > >               must be a power of 2 between 4KiB and 512KiB and the
> > > default value is 4.  For a cache
> > >               pool the value must be between 32KiB and 1GiB and the
> > > default value is 64.  For a thin
> > >               pool the value must be between 64KiB and 1GiB and the
> > > default value starts with 64 and
> > >               scales up to fit the pool metadata size within 128MiB,
> > > if the pool metadata size is not
> > >               specified.  The value must be a multiple of 64KiB.  See
> > > lvmthin(7) and lvmcache(7) for
> > >               more information.
> > >
> > > What's going to be the best approach to resolve this so that thin
> > > provisioning works as expected? It's obviously not advisable to use in
> > > this configuration due to the inevitable disk exhaustion issue that
> > > will arise.
> >
> >
> > Mike you wrote the relevant patch that appears to be causing the
> > conflict and prevents dm-thin passing the discard to VDO here:
> >
> > https://www.redhat.com/archives/dm-devel/2012-August/msg00381.html
> >
> > I know it was a while back but do you recall what the reason for the
> > max_discard_sector and sectors_per_block comparison was for?
>
> DM thinp cannot make use of a discard that only cover part of a dm-thinp
> block.  SO its internal accounting wouldn't work.
>
> Now in the VDO case, you still _really_ want the discard (that DM thinp
> cannot use, and as such will not reclaim and reuse the associated block)
> to get passed down -- so VDO can recover space, etc.
>
> > From the VDO code it appears untenable to increase maxDiscardSector
> > without major performance impact - to the extent of I/O stalls.
>
> That needs to be explored further.  Only allowing 4K discards is also a
> serious source of performance loss (by forcing the block core's
> bldev_issue_discard to iterate on such a small granularity).
>
> Pretty sure Zdenek found that VDO's discard performance was _very_
> slow.
>
> > So it looks like the only way to make this work is a change to dm-thin
> > to ensure the discards are still passed to the VDO layer below it.
>
> Not opposed to adding that.  Think it'll require a new feature though,
> e.g. "discard_passdown".  We already have "no_discard_passdown" -- which
> is safe, whereas "discard_passdown" could be unsafe (if device simply
> doesn't support diacrds at all).. so the constraint for the
> "discard_passdown" override must be that the pool's underlying data
> device does actually support discard.
>
> But all said, discard passdown happens as a side-effect at the end of
> dm-thinp's discard processing (that is all done in terms of
> dm-bio-prison locking that occurs at a thinp blocksize granularity).  As
> such it could become quite complex to update dm-thinp's discard
> code-path to process discards that don't cover an entire thinp block.
> Might not be awful, but just letting you know as an upfront disclaimer.
>
> Another option might be to see what shit hits the fan if we were to
> relax the DM thinp blocksize all the way down to 4K.  It'll definitely
> put pressure on the thinp metadata, etc.  Could result in serious
> performance hit, and more side-effects I cannot devine at the moment.
> But it is a "cheap" way forward.. but in general we'd probably want to
> gate the use of such a small a blocksize on some sort of
> i-know-what-i'm-doing feature.
>
> Mike
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/vdo-devel/attachments/20180711/c009238a/attachment.htm>