[dm-devel] [PATCH RFCv2 00/10] dm-dedup: device-mapper deduplication target

Darrick J. Wong darrick.wong at oracle.com
Wed Dec 3 02:31:32 UTC 2014


On Thu, Aug 28, 2014 at 06:48:28PM -0400, Vasily Tarasov wrote:
> This is a second request for comments for dm-dedup.
> 
> Updates compared to the first submission:
> 
> - code is updated to kernel 3.16
> - construction parameters are now positional (as in other targets)
> - documentation is extended and brought to the same format as in other targets
> 
> Dm-dedup is a device-mapper deduplication target.  Every write coming to the
> dm-dedup instance is deduplicated against previously written data.  For
> datasets that contain many duplicates scattered across the disk (e.g.,
> collections of virtual machine disk images and backups) deduplication provides
> a significant amount of space savings.
> 
> To quickly identify duplicates, dm-dedup maintains an index of hashes for all
> written blocks.  A block is a user-configurable unit of deduplication with a
> recommended block size of 4KB.  dm-dedup's index, along with other
> deduplication metadata, resides on a separate block device, which we refer to
> as a metadata device.  Although the metadata device can be on any block
> device, e.g., an HDD or its own partition, for higher performance we recommend
> using SSD devices to store metadata.
> 
> Dm-dedup is designed to support pluggable metadata backends.  A metadata
> backend is responsible for storing metadata: LBN-to-PBN and HASH-to-PBN
> mappings, allocation maps, and reference counters.  (LBN: Logical Block
> Number, PBN: Physical Block Number).  Currently we implemented "cowbtree" and
> "inram" backends.  The cowbtree uses device-mapper persistent API to store
> metadata.  The inram backend stores all metadata in RAM as a hash table.
> 
> Detailed design is described here:
> 
> http://www.fsl.cs.sunysb.edu/docs/ols-dmdedup/dmdedup-ols14.pdf
> 
> Our preliminary experiments on real traces demonstrate that Dmdedup can even
> exceed the performance of a disk drive running ext4.  The reasons are that (1)
> deduplication reduces I/O traffic to the data device, and (2) Dmdedup
> effectively sequentializes random writes to the data device.

Hi!  /me starts playing with the patches at:
git://git.fsl.cs.stonybrook.edu/linux-dmdedup.git#dm-dedup-devel

They seem to apply ok to 3.18-rc7, so I got to poke around long enough to have
questions/comments:

Is there a way for it to automatically garbage collect?  I started rewriting
the same block tons of times[1], but then the device filled up and all the
writes stopped.  If I sent the "garbage_collect" message every 15s it wouldn't
wedge like that, but if I let it hang, garbage collecting didn't un-wedge the
wac processes.
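
To make the fill-up concrete, here is a toy userspace model in C.  It is not
taken from the patches.  The struct, the function names (toy_init, toy_write,
toy_garbage_collect), the 4-block sizes, and above all the assumption that the
hash index pins each physical block with one reference until garbage collection
runs are illustrative guesses based on the cover letter's description of the
LBN-to-PBN and HASH-to-PBN mappings and reference counters.

```c
/* Toy userspace model, NOT dm-dedup code: every name here is hypothetical. */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_LBN 4    /* logical blocks exposed to the user */
#define NR_PBN 4    /* physical blocks on the data device */
#define NO_PBN (-1)

struct toy_dedup {
	int lbn_to_pbn[NR_LBN];     /* LBN -> PBN mapping */
	uint32_t pbn_hash[NR_PBN];  /* hash index: one entry per used PBN */
	int refcount[NR_PBN];       /* 0 = free; the hash index holds 1 ref */
};

void toy_init(struct toy_dedup *d)
{
	memset(d, 0, sizeof(*d));
	for (int i = 0; i < NR_LBN; i++)
		d->lbn_to_pbn[i] = NO_PBN;
}

/* Write a block whose content hashes to 'hash'.  Returns 0, or -1 if full. */
int toy_write(struct toy_dedup *d, int lbn, uint32_t hash)
{
	assert(lbn >= 0 && lbn < NR_LBN);
	int pbn = NO_PBN, old = d->lbn_to_pbn[lbn];

	for (int i = 0; i < NR_PBN; i++)        /* HASH -> PBN lookup */
		if (d->refcount[i] > 0 && d->pbn_hash[i] == hash)
			pbn = i;
	if (pbn == NO_PBN) {                    /* new data: allocate */
		for (int i = 0; i < NR_PBN; i++)
			if (d->refcount[i] == 0) {
				pbn = i;
				break;
			}
		if (pbn == NO_PBN)
			return -1;              /* no free block: writer stalls */
		d->pbn_hash[pbn] = hash;
		d->refcount[pbn] = 1;           /* reference held by hash index */
	}
	d->refcount[pbn]++;                     /* reference held by this LBN */
	d->lbn_to_pbn[lbn] = pbn;
	if (old != NO_PBN)
		d->refcount[old]--;             /* drop the overwritten mapping */
	return 0;
}

/* Free blocks that only the hash index still references. */
int toy_garbage_collect(struct toy_dedup *d)
{
	int freed = 0;

	for (int i = 0; i < NR_PBN; i++)
		if (d->refcount[i] == 1) {
			d->refcount[i] = 0;
			freed++;
		}
	return freed;
}
```

In this model, four overwrites of LBN 0 with distinct contents consume all four
physical blocks and the fifth toy_write() returns -1.  One
toy_garbage_collect() call then frees the three stale blocks and the write can
proceed, which matches the "send garbage_collect periodically" workaround but
says nothing about why already-hung writers stay hung.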

Loading with the cowbtree backend caused a crash in target_message (dm core)
with a RIP of zero when I tried to send the garbage_collect message.

It would be nice if one could send discard and (optionally) do checksum
verification on the read path.  I'll look into adding those once I get a better
grasp on what the code is doing.  Fortunately dm-dedup is short. :)
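
As a rough illustration of what read-path verification could look like, here is
a userspace sketch in C.  It assumes a hash is recorded per physical block at
write time and recomputed on read.  That per-block layout, the toy_* names, and
the use of FNV-1a as a stand-in hash are all assumptions for the example, not
how the patches organize their metadata.

```c
/* Toy sketch of read-path verification; NOT dm-dedup code. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLOCK_SIZE 4096
#define TOY_NR_BLOCKS  8

struct toy_store {
	uint8_t  data[TOY_NR_BLOCKS][TOY_BLOCK_SIZE];
	uint32_t hash[TOY_NR_BLOCKS];   /* hash recorded at write time */
};

/* 32-bit FNV-1a, standing in for whatever hash the target is set up with. */
uint32_t toy_hash(const uint8_t *buf, size_t len)
{
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < len; i++) {
		h ^= buf[i];
		h *= 16777619u;
	}
	return h;
}

/* Store one block and remember its hash. */
void toy_store_write(struct toy_store *s, int pbn, const uint8_t *buf)
{
	assert(pbn >= 0 && pbn < TOY_NR_BLOCKS);
	memcpy(s->data[pbn], buf, TOY_BLOCK_SIZE);
	s->hash[pbn] = toy_hash(buf, TOY_BLOCK_SIZE);
}

/* Returns 0 on success, -1 if the stored block no longer matches its hash. */
int toy_store_read(const struct toy_store *s, int pbn, uint8_t *buf)
{
	assert(pbn >= 0 && pbn < TOY_NR_BLOCKS);
	if (toy_hash(s->data[pbn], TOY_BLOCK_SIZE) != s->hash[pbn])
		return -1;              /* corruption detected on read */
	memcpy(buf, s->data[pbn], TOY_BLOCK_SIZE);
	return 0;
}
```

The cost is one extra hash computation per read, which is why making it
optional seems sensible.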

I suspect that this business in my_endio that uses bio_iovec to free the page
isn't going to work with the iterator rework.  When I tried bulk writing 128M
of zeroes to the device, it blew up while trying to free_pages some nonexistent
page.  Changing it to use bio_for_each_segment_all() and free bvec->bv_page
gets us to free the correct page, at least, but the next IO splats.

Thanks for clearing out some of the BUG*()s.

FYI, dm-dedup might be an easier way to do data block checksumming for ext4,
hence my interest.  I ran the ext4 metadata checksum test and it managed to
finish without any blowups, though xfstests was not so lucky.  Amusingly the
dedupe ratio was ~53 when it finished.

--D

[1] wac.c: http://djwong.org/docs/wac.c
$ gcc -Wall -o wac wac.c
$ ./wac -l 65536 -n32 -m32 -y32 -z32 -f -r $DEDUPE_DEVICE

> 
> Dmdedup is developed by a joint group of researchers from Stony Brook
> University, Harvey Mudd College, and EMC.  See the documentation patch for
> more details.
> 
> Vasily Tarasov (10):
>   dm-dedup: main data structures
>   dm-dedup: core deduplication logic
>   dm-dedup: hash computation
>   dm-dedup: implementation of the read-on-write procedure
>   dm-dedup: COW B-tree backend
>   dm-dedup: inram backend
>   dm-dedup: Makefile changes
>   dm-dedup: Kconfig changes
>   dm-dedup: status function
>   dm-dedup: documentation
> 
>  Documentation/device-mapper/dedup.txt |  205 +++++++
>  drivers/md/Kconfig                    |    8 +
>  drivers/md/Makefile                   |    2 +
>  drivers/md/dm-dedup-backend.h         |  114 ++++
>  drivers/md/dm-dedup-cbt.c             |  755 ++++++++++++++++++++++++++
>  drivers/md/dm-dedup-cbt.h             |   44 ++
>  drivers/md/dm-dedup-hash.c            |  145 +++++
>  drivers/md/dm-dedup-hash.h            |   30 +
>  drivers/md/dm-dedup-kvstore.h         |   51 ++
>  drivers/md/dm-dedup-ram.c             |  580 ++++++++++++++++++++
>  drivers/md/dm-dedup-ram.h             |   43 ++
>  drivers/md/dm-dedup-rw.c              |  248 +++++++++
>  drivers/md/dm-dedup-rw.h              |   19 +
>  drivers/md/dm-dedup-target.c          |  946 +++++++++++++++++++++++++++++++++
>  drivers/md/dm-dedup-target.h          |  100 ++++
>  15 files changed, 3290 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/device-mapper/dedup.txt
>  create mode 100644 drivers/md/dm-dedup-backend.h
>  create mode 100644 drivers/md/dm-dedup-cbt.c
>  create mode 100644 drivers/md/dm-dedup-cbt.h
>  create mode 100644 drivers/md/dm-dedup-hash.c
>  create mode 100644 drivers/md/dm-dedup-hash.h
>  create mode 100644 drivers/md/dm-dedup-kvstore.h
>  create mode 100644 drivers/md/dm-dedup-ram.c
>  create mode 100644 drivers/md/dm-dedup-ram.h
>  create mode 100644 drivers/md/dm-dedup-rw.c
>  create mode 100644 drivers/md/dm-dedup-rw.h
>  create mode 100644 drivers/md/dm-dedup-target.c
>  create mode 100644 drivers/md/dm-dedup-target.h
> 
> --
> dm-devel mailing list
> dm-devel at redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
