[dm-devel] I/O block when removing thin device on the same pool

Dennis Yang dennisyang at qnap.com
Mon Jan 25 09:13:10 UTC 2016


Hi,

I have done some experiments comparing kernel 4.2.8, which I am currently
using in production, with kernel 4.4 including commit 3d5f6733 ("dm thin
metadata: speed up discard of partially mapped volumes").

All the experiments below were performed on a dm-thin pool (512 KB block
size) whose metadata device is a RAID 0 of two Intel 480 GB SSDs and whose
data device is a zero-target DM device. The machine is equipped with an
Intel E3-1246 v3 CPU and 16 GB of RAM.
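For reference, the pool and the thin device were set up roughly along these
lines (the device names, sizes and low-water mark below are illustrative,
not the exact values I used):

  # data device: zero target (10 TiB = 21474836480 sectors)
  dmsetup create pool_data --table "0 21474836480 zero"

  # thin pool: metadata on the SSD RAID 0 (here /dev/md0), data on the
  # zero device, 512 KiB block size = 1024 sectors, low-water mark 65536
  dmsetup create pool --table \
    "0 21474836480 thin-pool /dev/md0 /dev/mapper/pool_data 1024 65536"

  # create thin device with dev id 0 and activate it
  dmsetup message /dev/mapper/pool 0 "create_thin 0"
  dmsetup create thin0 --table "0 21474836480 thin /dev/mapper/pool 0"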

To discard all the mappings of a fully-mapped 10 TB thin device with
512 KB block size:
kernel 4.4 takes 6m57s
kernel 4.2.8 takes 6m49s
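The discard was issued with something like the following (device name
illustrative):

  # discard every mapped block of the fully-mapped thin device
  blkdiscard /dev/mapper/thin0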

To delete a fully-mapped 10 TB thin device:
kernel 4.4 takes 48s
kernel 4.2.8 takes 47s
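Deleting the thin device amounts to the following (dev id 0 is illustrative):

  # deactivate the thin device, then drop it from the pool
  dmsetup remove thin0
  dmsetup message /dev/mapper/pool 0 "delete 0"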

In another experiment, I created an empty thin device and a fully-mapped
10 TB thin device, then started writing sequentially to the empty thin
device with fio before deleting the fully-mapped one. The write requests
get blocked for roughly 47~48 seconds, i.e. until the deletion process
finishes, on both kernel 4.2.8 and kernel 4.4. If I instead discard all
the mappings of the fully-mapped device while fio is running, the write
requests are still blocked until all the discard requests have finished.
I think this is because the pool's deferred list is filled with those
discard requests, leaving no spare computation resources for new write
requests to the other thin device. The thinp kworker thread runs at 100%
CPU utilisation while processing the discard requests.
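A rough sketch of how to reproduce this (device names, dev ids and fio
parameters are illustrative):

  # sequential writes to the empty thin device
  fio --name=seqwrite --filename=/dev/mapper/thin1 --rw=write --bs=1M \
      --ioengine=libaio --direct=1 --iodepth=32 &

  # while fio is running, delete (or blkdiscard) the fully-mapped device
  # that shares the same pool; writes to thin1 stall until this finishes
  dmsetup remove thin0
  dmsetup message /dev/mapper/pool 0 "delete 0"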

Hope this information helps.

Thanks,
Dennis


2016-01-23 0:43 GMT+08:00 Joe Thornber <thornber at redhat.com>:

> On Fri, Jan 22, 2016 at 02:38:28PM +0100, Lars Ellenberg wrote:
> > We have seen lvremove of thin snapshots take minutes before,
> > sometimes even ~20 minutes.
>
> I did some work on speeding up thin removal in autumn '14; in
> particular, aggressively prefetching metadata pages sped up the tree
> traversal hugely.  Could you confirm you're seeing pauses of this
> duration with current kernels, please?
>
> Obviously any pause, even a few seconds, is unacceptable.  Having a
> background kernel worker thread doing the delete, as you describe, is
> the way to go.  But there are complications to do with
> transactionality and crash protection that have prevented me from
> implementing it.  I'll think on it some more now that I know it's such
> a problem for you.
>
> - Joe
>
> --
> dm-devel mailing list
> dm-devel at redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
>



-- 
Dennis Yang
QNAP Systems, Inc.
Skype: qnap.dennis.yang
Email: dennisyang at qnap.com
Tel: (+886)-2-2393-5152 ext. 15018
Address: 13F., No.56, Sec. 1, Xinsheng S. Rd., Zhongzheng Dist., Taipei
City, Taiwan