[dm-devel] dm: Devices not having .iterate_devices may want to set max_io_len

Kirill Tkhai ktkhai at virtuozzo.com
Tue Dec 1 08:37:40 UTC 2020


On 30.11.2020 20:30, Mike Snitzer wrote:
> On Wed, Nov 25 2020 at  3:49pm -0500,
> Kirill Tkhai <ktkhai at virtuozzo.com> wrote:
> 
>> After commit 5091cdec56fa "dm: change max_io_len() to use
>> blk_max_size_offset()" my out-of-tree driver stopped to work.
>> The reason is that now ti->max_io_len from such target is ignored:
>> max_io_len() ignores ti->max_io_len, while
>> dm_calculate_queue_limits() never stacks ti->chunk_sectors into
>> ti_limits.chunk_sectors.
>>
>> Here I see two possible solutions, both are in dm_calculate_queue_limits():
>>
>> 1)Move ti_limits.chunk_sectors assignment down, so it will be made
>> right under combine_limits label. Thus, every time ti->max_io_len
>> will transform into chunk_sectors, even in case of device
>> has no .iterate_devices method;
>>
>> 2)Move io_hints call under the label (like it's made in this patch),
>> so one can set desired chunk_sectors there.
>>
>> First solution looks less clear, since in two drivers chunk_sectors
>> are assigned in io_hints (see unstripe_io_hints() and dmz_io_hints()),
>> and this rewrites, and we should not rewrite it.
>>
>> Second solution does not break anything since we change only
>> order with ->iterate_devices(device_area_is_invalid), while
>> device_area_is_invalid never touches chunk_sectors. So I choosed it.
>>
>> Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
>> ---
>>  drivers/md/dm-table.c |    8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
>> index 2073ee8d18f4..9994c767dc82 100644
>> --- a/drivers/md/dm-table.c
>> +++ b/drivers/md/dm-table.c
>> @@ -1453,10 +1453,6 @@ int dm_calculate_queue_limits(struct dm_table *table,
>>  		if (ti->max_io_len)
>>  			ti_limits.chunk_sectors = lcm_not_zero(ti->max_io_len,
>>  							       ti_limits.chunk_sectors);
>> -		/* Set I/O hints portion of queue limits */
>> -		if (ti->type->io_hints)
>> -			ti->type->io_hints(ti, &ti_limits);
>> -
>>  		/*
>>  		 * Check each device area is consistent with the target's
>>  		 * overall queue limits.
>> @@ -1466,6 +1462,10 @@ int dm_calculate_queue_limits(struct dm_table *table,
>>  			return -EINVAL;
>>  
>>  combine_limits:
>> +		/* Set I/O hints portion of queue limits */
>> +		if (ti->type->io_hints)
>> +			ti->type->io_hints(ti, &ti_limits);
>> +
>>  		/*
>>  		 * Merge this target's queue limits into the overall limits
>>  		 * for the table.
>>
>>
> 
> Sorry for the slow response, just got back from PTO today.
> 
> So while I appreciate that commit 5091cdec56fa caused your DM target to
> regress I don't think the proper solution is for your driver to take on
> setting chunk_sectors based on the ti->max_io_len you've established.
> 
> The use of chunk_sectors is an implementation detail.  One that every DM
> target that doesn't establish .iterate_devices should not need to take
> on worrying about.
> 
> Your above proposed patch changes DM core to _start_ to fix the
> regression you've reported but still requires your DM target to change
> too.  Does you DM target have one or more data device(s)?  If so you
> really should have it provide a .iterate_devices hook.  But that aside,

This is a virtual device, which never interact with data device, so there
is no .iterate_devices.

> here is the fix I'll be staging in linux-next shortly and that I'll be
> sending to Linus by the end of the week:
> 
> From: Mike Snitzer <snitzer at redhat.com>
> Date: Mon, 30 Nov 2020 10:57:43 -0500
> Subject: [PATCH] dm table: fix IO splitting
> 
> Commit 882ec4e609c1 ("dm table: stack 'chunk_sectors' limit to account
> for target-specific splitting") caused a regression for a couple
> reasons:
> 
> 1) Using lcm_not_zero() instead of min_not_zero() when stacking
>    chunk_sectors was a bug because chunk_sectors must reflect the most
>    limited of all devices in the IO stack.
> 2) DM targets that set max_io_len but that do _not_ provide an
>    .iterate_devices method no longer had there IO split properly.
> 
> dm_calculate_queue_limits() must establish the appropriate IO
> splitting limits early, without any dependency on iterating
> data_devices, otherwise IO will not be split as intended.
> 
> Fixes: 882ec4e609c1 ("dm table: stack 'chunk_sectors' limit to account for target-specific splitting")
> Cc: stable at vger.kernel.org
> Reported-by: John Dorminy <jdorminy at redhat.com>
> Reported-by: Bruce Johnston <bjohnsto at redhat.com>
> Reported-by: Kirill Tkhai <ktkhai at virtuozzo.com>
> Signed-off-by: Mike Snitzer <snitzer at redhat.com>

This works for me. Thanks, Mike.

You may add my Tested-by: Kirill Tkhai <ktkhai at virtuozzo.com>

> ---
>  drivers/md/dm-table.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> index 2073ee8d18f4..4e58f5c68ac0 100644
> --- a/drivers/md/dm-table.c
> +++ b/drivers/md/dm-table.c
> @@ -18,7 +18,6 @@
>  #include <linux/mutex.h>
>  #include <linux/delay.h>
>  #include <linux/atomic.h>
> -#include <linux/lcm.h>
>  #include <linux/blk-mq.h>
>  #include <linux/mount.h>
>  #include <linux/dax.h>
> @@ -1431,6 +1430,10 @@ int dm_calculate_queue_limits(struct dm_table *table,
>  
>  		ti = dm_table_get_target(table, i);
>  
> +		/* Set appropriate limits if target-specific splitting is required */
> +		if (ti->max_io_len)
> +			ti_limits.chunk_sectors = ti->max_io_len;
> +
>  		if (!ti->type->iterate_devices)
>  			goto combine_limits;
>  
> @@ -1449,10 +1452,6 @@ int dm_calculate_queue_limits(struct dm_table *table,
>  			zone_sectors = ti_limits.chunk_sectors;
>  		}
>  
> -		/* Stack chunk_sectors if target-specific splitting is required */
> -		if (ti->max_io_len)
> -			ti_limits.chunk_sectors = lcm_not_zero(ti->max_io_len,
> -							       ti_limits.chunk_sectors);
>  		/* Set I/O hints portion of queue limits */
>  		if (ti->type->io_hints)
>  			ti->type->io_hints(ti, &ti_limits);
> 




More information about the dm-devel mailing list