[dm-devel] dm: Fix UAF in run_timer_softirq()
Mike Snitzer
snitzer at redhat.com
Mon Oct 17 19:38:32 UTC 2022
On Mon, Oct 10 2022 at 10:39P -0400,
Luo Meng <luomeng at huaweicloud.com> wrote:
> From: Luo Meng <luomeng12 at huawei.com>
>
> When dm_resume() and dm_destroy() run concurrently, a use-after-free
> (UAF) can occur.
>
> One such concurrent UAF is shown below:
>
> use                                    free
> do_resume                            |
>   __find_device_hash_cell            |
>     dm_get                           |
>       atomic_inc(&md->holders)       |
>                                      | dm_destroy
>                                      |   __dm_destroy
>                                      |     if (!dm_suspended_md(md))
>                                      |     atomic_read(&md->holders)
>                                      |     msleep(1)
>   dm_resume                          |
>     __dm_resume                      |
>       dm_table_resume_targets        |
>         pool_resume                  |
>           do_waker # add delay work  |
>                                      | dm_table_destroy
>                                      |   pool_dtr
>                                      |     __pool_dec
>                                      |       __pool_destroy
>                                      |         destroy_workqueue
>                                      |         kfree(pool) # free pool
> time out
> __do_softirq
>   run_timer_softirq # pool has already been freed
>
> This can be easily reproduced using:
> 1. create thin-pool
> 2. dmsetup suspend pool
> 3. dmsetup resume pool
> 4. dmsetup remove_all # Concurrent with 3
>
> The root cause of the UAF is that dm_resume() arms the pool's timer
> after dm_destroy() has skipped canceling it because the device was
> still in the suspended state. When the timer later expires,
> run_timer_softirq() runs the callback, but the pool has already been
> freed, so the use-after-free occurs.
>
> Therefore, cancel the timer only after md->holders has dropped to zero.
>
> Signed-off-by: Luo Meng <luomeng12 at huawei.com>
> ---
> drivers/md/dm.c | 26 +++++++++++++-------------
> 1 file changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 60549b65c799..379525313628 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -2420,6 +2420,19 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
>
> blk_mark_disk_dead(md->disk);
>
> + /*
> + * Rare, but there may be I/O requests still going to complete,
> + * for example. Wait for all references to disappear.
> + * No one should increment the reference count of the mapped_device,
> + * after the mapped_device state becomes DMF_FREEING.
> + */
> + if (wait)
> + while (atomic_read(&md->holders))
> + msleep(1);
> + else if (atomic_read(&md->holders))
> + DMWARN("%s: Forcibly removing mapped_device still in use! (%d users)",
> + dm_device_name(md), atomic_read(&md->holders));
> +
> /*
> * Take suspend_lock so that presuspend and postsuspend methods
> * do not race with internal suspend.
> @@ -2436,19 +2449,6 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
> dm_put_live_table(md, srcu_idx);
> mutex_unlock(&md->suspend_lock);
>
> - /*
> - * Rare, but there may be I/O requests still going to complete,
> - * for example. Wait for all references to disappear.
> - * No one should increment the reference count of the mapped_device,
> - * after the mapped_device state becomes DMF_FREEING.
> - */
> - if (wait)
> - while (atomic_read(&md->holders))
> - msleep(1);
> - else if (atomic_read(&md->holders))
> - DMWARN("%s: Forcibly removing mapped_device still in use! (%d users)",
> - dm_device_name(md), atomic_read(&md->holders));
> -
> dm_table_destroy(__unbind(md));
> free_dev(md);
> }
> --
> 2.31.1
>
Thanks for the report, but your fix seems wrong. A thin-pool-specific
fix seems much more appropriate. Does this fix the issue?
diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index e76c96c760a9..dc271c107fb5 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -2889,6 +2889,8 @@ static void __pool_destroy(struct pool *pool)
dm_bio_prison_destroy(pool->prison);
dm_kcopyd_client_destroy(pool->copier);
+ cancel_delayed_work_sync(&pool->waker);
+ cancel_delayed_work_sync(&pool->no_space_timeout);
if (pool->wq)
destroy_workqueue(pool->wq);