[vdo-devel] [RFA] strange 'data blocks used' behavior in vdostats

Louis Imershein limershe at redhat.com
Tue Oct 20 17:08:25 UTC 2020


One thing I've seen in the past that can result in strange behavior is
that 4K-aligned runs of zeros (zero blocks) get eliminated even with
deduplication disabled.
Could blocks like this exist in your larger files?  Just a thought.
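
A minimal sketch for checking this, assuming the test file can be read
directly and that file offsets line up with 4K device blocks (a filesystem
on top of VDO may not guarantee that). The function name and the 4096
default are illustrative, not part of any VDO tool:

```python
import io

BLOCK_SIZE = 4096  # assumed VDO block size (4K)


def count_zero_blocks(stream, block_size=BLOCK_SIZE):
    """Return (zero_blocks, total_blocks) for a binary stream.

    A block counts as a zero block when every byte in it is 0. A short
    trailing block is counted too, on the assumption that it would be
    zero-padded when written out.
    """
    zero_blocks = 0
    total_blocks = 0
    while True:
        block = stream.read(block_size)
        if not block:
            break
        total_blocks += 1
        if block.count(0) == len(block):
            zero_blocks += 1
    return zero_blocks, total_blocks


if __name__ == "__main__":
    # Synthetic data: one zero block, one non-zero block, a zero tail.
    sample = io.BytesIO(bytes(BLOCK_SIZE) + b"a" * BLOCK_SIZE + bytes(100))
    print(count_zero_blocks(sample))
```

For a real file, pass open(path, "rb") instead of the BytesIO sample. A
plain text file such as bible.txt would be expected to contain no zero
blocks at all.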

-louis

On Tue, Oct 20, 2020 at 9:24 AM Sweet Tea Dorminy <sweettea at redhat.com> wrote:

> Hi Philipp:
>
> Quick question: You mentioned writing the same file to VDO over and
> over; are you using a filesystem atop VDO, or are you using dd or
> equivalent to write the file to the VDO device at different offsets?
>
> Thanks!
>
> On Tue, Oct 20, 2020 at 12:04 PM Philipp Rudo <prudo at linux.ibm.com> wrote:
> >
> > Hi everybody,
> >
> > I'm a kernel developer for s390. Together with Leon, I'm currently trying
> > to evaluate the costs and gains of using deflate instead of lz4 in vdo.
> > The idea is to make use of the in-hardware deflate implementation
> > introduced with our latest machine generation. In this effort we are
> > currently running some tests on the compression ratio, which show a rather
> > peculiar behavior we don't understand. So we are reaching out to you in
> > the hope that you can help us find out what's going on.
> >
> > In our test (details below) we simply copy the same file (~5 MB) to a vdo
> > device until we reach the target logical size (deduplication disabled).
> > Then we wait until the packer is finished and get the vdostats. As
> > block size << file size << target size, we expected to get a constant
> > 'saving percent'. But what we see is that the 'saving percent' starts
> > high, has a minimum at ~10GB and then grows again (seemingly
> > logarithmically). While the behavior for small sizes can be explained by a
> > constant overhead, the logarithmic growth for large sizes looks odd to us.
> >
> > Looking at the raw data we noticed that the 'logical blocks used' grow
> > linearly with the target size (as expected), while the 'data blocks used'
> > behave rather erratically. What especially surprises us is that the 'data
> > blocks used' reach a peak at ~20GB and then go down again. So although
> > more of the same data is compressed with the same algorithm, less disk
> > space is used?
> >
> > We can reproduce this behavior with our prototype (both algorithms), an
> > official RHEL 8.3 build and different files.
> >
> > Do you have an idea what causes this behavior? Are we missing something
> > fundamental?
> >
> > Thanks and sorry for the long mail
> > Philipp
> >
> > ----
> > Test details:
> >
> > OS: RHEL 8.3 Snapshot 3
> > kernel: 4.18.0-235.el8.s390x
> > vdo: 6.2.3.114-14.el8
> > file: bible.txt from http://corpus.canterbury.ac.nz/resources/large.tar.gz
> >
> > size (in MB)    logical blocks used     data blocks used        saving percent
> > 100             296241                  22395                   92.4%
> > 1000            526688                  231178                  56.1%
> > 2000            782848                  462765                  40.8%
> > 4000            1295170                 931232                  28.1%
> > 6000            1807495                 1405123                 22.2%
> > 8000            2318824                 1865933                 19.5%
> > 10000           2831146                 2268763                 19.8%
> > 12000           3343470                 2503638                 25.1%
> > 14000           3854801                 2534747                 34.2%
> > 16000           4367119                 2607669                 40.2%
> > 18000           4879445                 2821335                 42.1%
> > 20000           5390780                 2824725                 47.6%
> > 22000           5903107                 2909582                 50.7%
> > 24000           6415433                 2727539                 57.4%
> > 26000           6927756                 2681278                 61.2%
> > 28000           7439083                 2647278                 64.4%
> > 30000           7951405                 2433786                 69.3%
> > 32000           8463724                 2428650                 71.3%
> >
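
The 'saving percent' column in the quoted table can be reproduced from the
two block counts. The sketch below assumes the value is
(logical - data) / logical, truncated rather than rounded to one decimal;
that is an inference from the numbers above (for example the 2000 MB and
10000 MB rows), not a statement about how vdostats computes it. The
function name is hypothetical.

```python
# Subset of rows from the quoted table:
# (size in MB, logical blocks used, data blocks used)
ROWS = [
    (100, 296241, 22395),
    (2000, 782848, 462765),
    (10000, 2831146, 2268763),
    (22000, 5903107, 2909582),
    (32000, 8463724, 2428650),
]


def saving_percent(logical_blocks_used, data_blocks_used):
    """Percentage of logical blocks not backed by data blocks.

    Truncated (not rounded) to one decimal place. The truncation is an
    assumption chosen because it matches the table in the quoted mail.
    """
    saved = logical_blocks_used - data_blocks_used
    # Integer arithmetic in tenths of a percent avoids rounding surprises.
    return (1000 * saved // logical_blocks_used) / 10


if __name__ == "__main__":
    for size_mb, logical, data in ROWS:
        print(f"{size_mb:>6} MB  {saving_percent(logical, data):5.1f}%")
```

Since the logical count grows almost linearly, the dip and recovery in
'saving percent' follow entirely from the 'data blocks used' column, which
is where the odd behavior sits.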
> > _______________________________________________
> > vdo-devel mailing list
> > vdo-devel at redhat.com
> > https://www.redhat.com/mailman/listinfo/vdo-devel
> >