[dm-devel] Possible data corruption with dm-thin

Dennis Yang dennisyang at qnap.com
Mon Jun 27 09:32:35 UTC 2016


Hi Joe,

2016-06-24 21:55 GMT+08:00 Edward Thornber <thornber at redhat.com>:

> Hi Dennis,
>
> On Tue, Jun 21, 2016 at 03:56:26PM +0800, Dennis Yang wrote:
> > So my question is, does dm-thin have any mechanism to eliminate the
> > race when discarded block is reused right away by another device?
>
> I'll try and recreate your scenario.  The transaction-manager will not
> reallocate a block that's been freed within a transaction so that we
> can always rollback.  So as long as there hasn't been a commit,
> reallocation shouldn't be possible.  This is what normally guards
> allocation, but in your case I think you may be onto something and we
> may have to hold onto the data cell until the passdown discards are
> complete.
>
> - Joe
>

In my experience, this issue is hard to reproduce with a thin-pool built
directly on top of a regular hard disk or RAID, since I rarely observe a
DISCARD and a WRITE coming from different thin devices getting reordered
in the lower level. In my system, the thin-pool is built on top of
another device-mapper device that provides data tiering functionality,
which can delay the DISCARD request a little and lead to request
reordering.

Based on the log I have, I suspect that there was a metadata commit
before another thin device reused the discarded block, while the DISCARD
request was still being processed by the lower-level stack. In this
case, holding the data cell until the passdown discards are complete
seems to protect the discarded block only from being reallocated by the
same thin device that allocated the block in the first place, because
only write I/Os going to that same device will be detained in the cell.
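
To make the suspected sequence concrete, here is a toy Python model of
it. This is only a sketch, not dm-thin code: ToyPool and all of its
methods are hypothetical names invented for illustration, and the
"lower layer" is just a list of delayed discards.

```python
# Toy model only: ToyPool and its methods are hypothetical, not dm-thin code.
class ToyPool:
    def __init__(self, nr_blocks):
        self.free = set(range(nr_blocks))  # blocks that may be allocated
        self.pending_free = set()          # freed in the open transaction
        self.data = {}                     # block -> payload on the lower device
        self.lower_queue = []              # discards delayed by the lower layer

    def alloc(self):
        # Blocks freed in the open transaction are not reusable until commit.
        block = min(self.free)
        self.free.remove(block)
        return block

    def write(self, block, payload):
        self.data[block] = payload

    def discard(self, block):
        # The block is freed right away; the passdown discard is only queued.
        self.pending_free.add(block)
        self.lower_queue.append(block)

    def commit(self):
        # After a commit, blocks freed in the transaction become reusable.
        self.free |= self.pending_free
        self.pending_free.clear()

    def lower_layer_runs(self):
        # The delayed discards finally reach the media.
        for block in self.lower_queue:
            self.data.pop(block, None)
        self.lower_queue.clear()


pool = ToyPool(nr_blocks=1)
b = pool.alloc()             # thin A gets the only block
pool.write(b, "A-data")
pool.discard(b)              # thin A discards; passdown still in flight
pool.commit()                # metadata commit makes the block reusable
b2 = pool.alloc()            # thin B is handed the same block
pool.write(b2, "B-data")
pool.lower_layer_runs()      # stale discard lands after B's write
lost = b2 not in pool.data   # thin B's data has been discarded
```

In this model nothing ties the reallocation to the in-flight discard
once the commit has happened, which is the window I believe we hit.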

In my opinion, the to-be-discarded block should only be freed after the
passdown discards have completed. So far I have only written a patch
that works this way and have been running it at my site since last week;
the issue has not been seen there for 72 hours. I am not very confident
that this patch is free of side effects, so I would highly appreciate it
if you could share your concerns with me.
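
The ordering I have in mind can be sketched with a toy Python model.
This is not the patch itself and not dm-thin code: DeferredFreePool and
its methods are hypothetical names, and passdown_complete() stands in
for whatever completion path the real discard would use.

```python
# Toy model only: DeferredFreePool is hypothetical, not dm-thin code.
class DeferredFreePool:
    def __init__(self, nr_blocks):
        self.free = set(range(nr_blocks))  # blocks that may be allocated
        self.pending_free = set()          # freed in the open transaction
        self.in_flight = []                # discards passed down, not completed
        self.data = {}                     # block -> payload on the lower device

    def alloc(self):
        if not self.free:
            return None                    # nothing reusable yet
        block = min(self.free)
        self.free.remove(block)
        return block

    def write(self, block, payload):
        self.data[block] = payload

    def discard(self, block):
        # Only pass the discard down; the block is NOT freed here.
        self.in_flight.append(block)

    def commit(self):
        self.free |= self.pending_free
        self.pending_free.clear()

    def passdown_complete(self):
        # The lower layer has finished the discards: free the blocks now.
        for block in self.in_flight:
            self.data.pop(block, None)
            self.pending_free.add(block)
        self.in_flight.clear()


pool = DeferredFreePool(nr_blocks=1)
b = pool.alloc()
pool.write(b, "A-data")
pool.discard(b)              # passdown in flight, block still held
pool.commit()
early = pool.alloc()         # None: the block cannot be reused yet
pool.passdown_complete()     # discard done, block freed in this transaction
pool.commit()
b2 = pool.alloc()            # only now can thin B get the block
pool.write(b2, "B-data")
survived = pool.data.get(b2) == "B-data"
```

The cost in this model is that a discarded block stays unavailable
until its passdown completes and the next commit happens, which may be
one of the side effects worth checking in the real code.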

Thanks for your reply,
Dennis




-- 
Dennis Yang
QNAP Systems, Inc.
Skype: qnap.dennis.yang
Email: dennisyang at qnap.com
Tel: (+886)-2-2393-5152 ext. 15018
Address: 13F., No.56, Sec. 1, Xinsheng S. Rd., Zhongzheng Dist., Taipei
City, Taiwan