[Vdo-devel] Trying to test thin provisioned LVM on VDO
Mike Snitzer
snitzer at redhat.com
Wed Jul 11 13:54:48 UTC 2018
That's fine.
On Wed, Jul 11 2018 at 9:38am -0400,
Nikhil Kshirsagar <nkshirsa at redhat.com> wrote:
> Hello,
> Would it be a good idea to document this in a kcs and also raise a bz
> preemptively?
> Regards,
> Nikhil.
> On Wed 11 Jul, 2018, 7:05 PM Mike Snitzer, <[1]snitzer at redhat.com> wrote:
>
> On Wed, Jul 11 2018 at 6:48am -0400,
> James Hogarth <[2]james.hogarth at gmail.com> wrote:
>
> > On 11 July 2018 at 11:26, James Hogarth <[3]james.hogarth at gmail.com>
> wrote:
> > > On 11 July 2018 at 10:40, Michael Sclafani <[4]sclafani at redhat.com>
> wrote:
> > >> Based on the error message and a quick scan of the code, it appears
> > >> dm-thin disables discards because VDO's max_discard_sectors = 4KB is
> > >> smaller than dm-thin's 64KB+ block size. I have no idea why it does
> > >> that, but if it neither discards nor zeros out blocks it has written
> > >> to VDO, that space will not be reclaimed.
> > >>
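[Editor's note: the comparison described above can be sketched in shell. The
values are illustrative (VDO's reported 4 KiB limit and the minimum 64 KiB
thin-pool block), not read from a live device; on a real system the device
limit is visible in /sys/block/<dev>/queue/discard_max_bytes.]

```shell
# Hedged sketch of the check that disables discard passdown in dm-thin.
# Values are illustrative, not read from a live device.
vdo_max_discard_bytes=4096      # what VDO advertises (4 KiB)
thin_block_bytes=65536          # smallest allowed thin-pool block (64 KiB)

if [ "$vdo_max_discard_bytes" -lt "$thin_block_bytes" ]; then
    echo "discard passdown disabled"
else
    echo "discard passdown enabled"
fi
```

Because a single discard from dm-thin could never cover a whole thin-pool
block at that granularity, the pool stops passing discards down entirely.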
> > >
> > > Thanks for confirming the line of thought I was following ...
> > >
> > > Annoyingly this makes the RHEL documentation pretty useless to follow
> > > for setting up thin provisioned volumes...
> > >
> > > Unfortunately I don't have a support account to hand to raise this as
> > > a RHEL7.5 issue to resolve ...
> > >
> > > Looking at the lvcreate man page it's not possible to set a block
> > > size for a thin pool below 64K
> > >
> > > -c|--chunksize Size[k|UNIT]
> > >        The size of chunks in a snapshot, cache pool or thin pool.
> > >        For snapshots, the value must be a power of 2 between 4KiB
> > >        and 512KiB and the default value is 4. For a cache pool the
> > >        value must be between 32KiB and 1GiB and the default value
> > >        is 64. For a thin pool the value must be between 64KiB and
> > >        1GiB and the default value starts with 64 and scales up to
> > >        fit the pool metadata size within 128MiB, if the pool
> > >        metadata size is not specified. The value must be a
> > >        multiple of 64KiB. See lvmthin(7) and lvmcache(7) for more
> > >        information.
> > >
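[Editor's note: the thin-pool chunk size rule quoted from lvcreate(8) can be
expressed as a small check. The function name is hypothetical; lvm performs
this validation itself when you pass -c/--chunksize.]

```shell
# Hedged sketch of the lvcreate(8) thin-pool chunk size constraint:
# a multiple of 64 KiB, between 64 KiB and 1 GiB (sizes in KiB).
is_valid_thin_chunksize_kib() {
    kib=$1
    [ "$kib" -ge 64 ] && [ "$kib" -le 1048576 ] && [ $((kib % 64)) -eq 0 ]
}

is_valid_thin_chunksize_kib 64 && echo "64k accepted"
is_valid_thin_chunksize_kib 4 || echo "4k rejected"
```

This is why there is no chunk size that could match VDO's 4 KiB discard
granularity: the smallest value lvcreate will accept is still 16x larger.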
> > > What's going to be the best approach to resolve this so that thin
> > > provisioning works as expected? It's obviously not advisable to use
> > > in this configuration due to the inevitable disk exhaustion issue
> > > that will arise.
> >
> >
> > Mike, you wrote the relevant patch that appears to be causing the
> > conflict and prevents dm-thin passing the discard to VDO here:
> >
> > [5]https://www.redhat.com/archives/dm-devel/2012-August/msg00381.html
> >
> > I know it was a while back, but do you recall the reason for the
> > max_discard_sectors and sectors_per_block comparison?
>
> DM thinp cannot make use of a discard that only covers part of a
> dm-thinp block, so its internal accounting wouldn't work.
>
> Now in the VDO case, you still _really_ want the discard (that DM thinp
> cannot use, and as such will not reclaim and reuse the associated block)
> to get passed down -- so VDO can recover space, etc.
>
> > From the VDO code it appears untenable to increase maxDiscardSector
> > without major performance impact - to the extent of I/O stalls.
>
> That needs to be explored further. Only allowing 4K discards is also a
> serious source of performance loss (by forcing the block core's
> blkdev_issue_discard to iterate on such a small granularity).
>
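[Editor's note: the iteration cost is easy to quantify. This is back-of-the-
envelope arithmetic only, assuming blkdev_issue_discard splits a discard into
bios no larger than the advertised limit; the 1 GiB figure is an invented
example, not from the thread.]

```shell
# Illustrative only: how many discard bios the block core must issue
# for a 1 GiB discard at each max_discard_sectors granularity.
discard_bytes=$((1024 * 1024 * 1024))
bios_at_4k=$((discard_bytes / 4096))
bios_at_64k=$((discard_bytes / 65536))
echo "4 KiB granularity:  $bios_at_4k bios"
echo "64 KiB granularity: $bios_at_64k bios"
```

A 4 KiB limit costs 16x more bios than even the minimum thin-pool block
size would, which is consistent with the slow discard performance noted.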
> Pretty sure Zdenek found that VDO's discard performance was _very_
> slow.
>
> > So it looks like the only way to make this work is a change to dm-thin
> > to ensure the discards are still passed to the VDO layer below it.
>
> Not opposed to adding that. Think it'll require a new feature though,
> e.g. "discard_passdown". We already have "no_discard_passdown" -- which
> is safe, whereas "discard_passdown" could be unsafe (if the device
> simply doesn't support discards at all)... so the constraint for the
> "discard_passdown" override must be that the pool's underlying data
> device does actually support discard.
>
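[Editor's note: for reference, the existing knob appears as a feature
argument on the thin-pool target's table line. The example below is
hypothetical -- device numbers, sizes, and the low-water mark are invented --
and simply shows where a new "discard_passdown" override would sit among the
feature arguments.]

```
# dmsetup table mypool   (hypothetical devices; 128 sectors = 64 KiB blocks)
# start length target    meta  data  block low_water #features feature
0 2097152 thin-pool 253:0 253:1 128   32768 1         no_discard_passdown
```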
> But all said, discard passdown happens as a side-effect at the end of
> dm-thinp's discard processing (that is all done in terms of
> dm-bio-prison locking that occurs at a thinp blocksize granularity). As
> such it could become quite complex to update dm-thinp's discard
> code-path to process discards that don't cover an entire thinp block.
> Might not be awful, but just letting you know as an upfront disclaimer.
>
> Another option might be to see what shit hits the fan if we were to
> relax the DM thinp blocksize all the way down to 4K. It'll definitely
> put pressure on the thinp metadata, etc. Could result in a serious
> performance hit, and more side-effects I cannot divine at the moment.
> But it is a "cheap" way forward... in general we'd probably want to
> gate the use of such a small blocksize on some sort of
> i-know-what-i'm-doing feature.
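[Editor's note: the metadata pressure can be ballparked. This assumes one
mapping entry per data block in the pool's metadata btree; the 1 TiB pool
size is an invented example.]

```shell
# Illustrative metadata pressure: block mappings needed to fully map a
# 1 TiB thin pool at 4 KiB vs the current minimum 64 KiB block size.
pool_bytes=$((1024 * 1024 * 1024 * 1024))
mappings_4k=$((pool_bytes / 4096))
mappings_64k=$((pool_bytes / 65536))
echo "4 KiB blocks:  $mappings_4k mappings"
echo "64 KiB blocks: $mappings_64k mappings"
```

Sixteen times as many mappings to store and look up is where the pressure
on the thinp metadata (and the performance hit) would come from.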
>
> Mike
>
> References
>
> Visible links
> 1. mailto:snitzer at redhat.com
> 2. mailto:james.hogarth at gmail.com
> 3. mailto:james.hogarth at gmail.com
> 4. mailto:sclafani at redhat.com
> 5. https://www.redhat.com/archives/dm-devel/2012-August/msg00381.html