[dm-devel] [PATCH RFCv2 01/10] dm-dedup: main data structures

Darrick J. Wong darrick.wong at oracle.com
Thu Dec 4 01:21:50 UTC 2014


On Wed, Nov 26, 2014 at 01:35:34PM -0500, Vasily Tarasov wrote:
> Hi Mike, Vivek,
> 
> Sounds good, thanks for looking into this!
> 
> At this point we don't have a dedup_checker. Could you clarify a bit
> on the main use case for a checker? Sudden power loss or accidental
> corruption of metadata/data devices?

<shrug> No replies for a week, so I'll wade in.  Keep in mind I'm a FS
developer, so don't take my replies as necessarily matching Mike or
Vivek's goals.

> In dm-dedup, metadata is stored using dm's persistent-data library
> (COW B-trees). Data blocks are written asynchronously with metadata
> but allocated sequentially. So, theoretically, on a sudden power loss
> the state of a dm-dedup device should remain consistent.

Theoretically, yes. :)

> But if somebody corrupts metadata/data devices manually the checker
> will help. Is it the main use case?

Or if the storage corrupts itself and you want/need to run a
consistency checker to scrape the broken crud off the disk so that you
can recover whatever's left.  There are also cases such as recovering
from accidental reformats (if possible); patching things up after the
kernel explodes midway through some operation; fixing up the mess
after your own software bugs out; and recovering when the storage
miswrites blocks to the wrong place.

It would also be useful to verify that a block still matches its
stored hash; that for all LBN->PBN mappings there's also a hash->PBN
mapping; and (optionally) to garbage collect any hash->PBN mappings
that no LBN references any more.
Theoretically you could also defrag the device.  Maybe this can even
be done in a background kernel thread (ha!), since the metadata's
already sitting around in memory.
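
As a rough illustration of the checks above, here is a minimal Python
sketch over an in-memory model. Everything in it is an assumption made
for the example, not dm-dedup's actual on-disk format or API: the two
maps are plain dicts, SHA-256 stands in for whatever hash the target is
configured with, and check_dedup(), block_hash() and read_block are
hypothetical names.

```python
import hashlib


def block_hash(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the target's configured hash.
    return hashlib.sha256(data).digest()


def check_dedup(lbn_to_pbn, hash_to_pbn, read_block):
    """Run three consistency checks over a hypothetical in-memory model.

    lbn_to_pbn:  dict mapping logical block number -> physical block number
    hash_to_pbn: dict mapping block hash -> physical block number
    read_block:  callable returning the bytes stored at a given PBN

    Returns (errors, orphans): errors is a list of (kind, pbn) tuples,
    orphans is a sorted list of PBNs that have a hash entry but are not
    referenced by any LBN (garbage-collection candidates).
    """
    errors = []
    pbn_to_hash = {pbn: h for h, pbn in hash_to_pbn.items()}

    # 1. Every block must still match its stored hash.
    for h, pbn in hash_to_pbn.items():
        if block_hash(read_block(pbn)) != h:
            errors.append(("hash-mismatch", pbn))

    # 2. Every LBN->PBN mapping needs a corresponding hash->PBN mapping.
    for pbn in lbn_to_pbn.values():
        if pbn not in pbn_to_hash:
            errors.append(("missing-hash", pbn))

    # 3. hash->PBN entries that no LBN points at can be collected.
    referenced = set(lbn_to_pbn.values())
    orphans = sorted(pbn for pbn in pbn_to_hash if pbn not in referenced)

    return errors, orphans
```

With three physical blocks where PBN 2 is mapped from an LBN but has no
hash entry and PBN 1 has a hash entry but no LBN, the sketch reports one
"missing-hash" error for PBN 2 and lists PBN 1 as an orphan.  A real
checker would walk the persistent-data B-trees instead of dicts.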

> We'll definitely take a look into the verifier's code for thin and
> cache targets and see how this applies to dm-dedup.

Looks promising so far, aside from the things I noted in yesterday's
email.  Thanks for contributing all this work!

--D

> 
> Thanks,
> Vasily
> 
> On Wed, Nov 26, 2014 at 11:47 AM, Mike Snitzer <snitzer at redhat.com> wrote:
> > On Wed, Nov 26 2014 at 11:36am -0500,
> > Erez Zadok <ezk at fsl.cs.sunysb.edu> wrote:
> >
> >> Mike, Vivek,
> >>
> >> Thank you for the effort and especially for adding more man-power to
> >> this review.  We know how busy you guys are so it’s understandable
> >> that things can take a while to get started.  Either way, I’ve
> >> instructed my students to give this project the highest priority,
> >> especially once we receive comments from you.
> >
> > Great.  So along those lines have you guys worked on userspace tools
> > that can verify/repair the ondisk metadata?
> >
> > That will be a prereq for upstream inclusion (at least for dm-dedup to
> > become anything but "experimental").
> >
> > dm-cache and dm-thin targets have these types of tools
> > (thin_{check,repair}, cache_{check,repair}, etc).  Upstream repo is here
> > (misnamed, gets packaged into device-mapper-persistent-data rpm on
> > Fedora, RHEL, CentOS, etc):
> > https://github.com/jthornber/thin-provisioning-tools
> >
> > Mike
> >
> 
> --
> dm-devel mailing list
> dm-devel at redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel



