[dm-devel] [PATCH v2 0/2] dm: Avoid use-after-free of a mapped device

Thu Feb 28 13:00:00 UTC 2013

On 02/28/13 01:42, Jun'ichi Nomura wrote:
> Hi Bart,
>
> On 02/27/13 23:45, Bart Van Assche wrote:
>> This mini-series of two patches prevents the device mapper
>> implementation from triggering a use-after-free during removal of a
>> mapped device. The two patches in this series are:
>> - block: Convert blk_run_queue() recursion into iteration.
>> - dm: Avoid running the md queue after the last dm_put().
>>
>> Note: these patches are the result of source reading. As far as I
>> know this issue has not (yet) caused any harm.
>
> Ref-counting of the mapped device works like this:
>    - dm depends on the fact that the block device is open while there
>      are bios/requests submitted.  So dm_get/put in dm_blk_open/close is
>      enough to keep the mapped device alive while there are bios.
>    - A request-based target has a tiny window between dm_blk_close()
>      and the end of rq_completed() because the opener may close the device
>      once the last bio completes even if a request is still finishing.
>      dm_get/dm_put in dm_start_request/rq_completed closes this window.
>      (See comments in dm_start_request())
>    - So, when dm_put() puts the last reference, there should be no
>      requests in the queue.
>    - If there is no reference to the mapped device, dm_destroy() may
>      start tearing it down.
>      It is ok if there is pending delayed work for the request queue
>      because blk_cleanup_queue() is called before freeing the mapped device
>      and cancels the delayed work.
>
> So as far as blk_run_queue_async() in rq_completed() is concerned,
> it is not a problem from a "use-after-free" point of view.

Hello Jun'ichi,

Thanks for the feedback. It is good to know that there is no risk of 
triggering a use-after-free with the current approach.
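
The reference-counting rules quoted above can be modeled in a few lines
of userspace C. This is a toy sketch, not kernel code: struct toy_md and
the toy_*() helpers are hypothetical names chosen for illustration, they
are not the dm_get()/dm_put() implementation, and only the counting is
modeled (no locking, no block layer, no delayed work).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a mapped device; only the counters are modeled. */
struct toy_md {
    int holders;    /* references held by openers and in-flight requests */
    int in_flight;  /* requests started but not yet completed */
};

/* Hypothetical analogues of taking and dropping a reference. */
static void toy_get(struct toy_md *md) { md->holders++; }
static void toy_put(struct toy_md *md)
{
    assert(md->holders > 0);    /* dropping below zero would be a bug */
    md->holders--;
}

/* Opening and closing the block device takes and drops one reference. */
static void toy_open(struct toy_md *md)  { toy_get(md); }
static void toy_close(struct toy_md *md) { toy_put(md); }

/* Each request holds its own reference from start to completion. */
static void toy_start_request(struct toy_md *md)
{
    toy_get(md);
    md->in_flight++;
}

static void toy_rq_completed(struct toy_md *md)
{
    md->in_flight--;
    toy_put(md);                /* the put is the last step of completion */
}

/* Teardown may only start once nobody holds a reference. */
static bool toy_may_destroy(const struct toy_md *md)
{
    return md->holders == 0;
}
```

In this model, calling toy_open(), toy_start_request() and toy_close()
in that order leaves one reference held by the request, so
toy_may_destroy() stays false until toy_rq_completed() has run. That is
the window the extra get/put pair is meant to cover.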

How about reposting these patches as a performance optimization? With
these patches I see slightly lower latency and slightly higher
throughput. With a dm-linear mapping on top of a RAM disk (brd), a
request size of 512 bytes and 100% reads, fio reports 2063K IOPS without
these patches and 2083K IOPS with both patches applied. That's an
improvement of about 1%. It's not much, but it comes on top of the
advantage that these two patches make the rq_completed() implementation
easier to understand and to reason about.
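
For reference, the "about 1%" figure is the relative change between the
two fio results. A minimal sketch of that arithmetic follows;
relative_gain_percent() is a hypothetical helper name, and the inputs
are the IOPS values quoted above, in thousands.

```c
/* Relative change from 'before' to 'after', expressed in percent. */
static double relative_gain_percent(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

With before = 2063 and after = 2083 this evaluates to 20 / 2063 * 100,
which is roughly 0.97%.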

Bart.