[dm-devel] [PATCH v2] block: use gcd() to fix chunk_sectors limit stacking

Damien Le Moal Damien.LeMoal at wdc.com
Fri Dec 4 06:22:06 UTC 2020


On 2020/12/04 11:11, Mike Snitzer wrote:
> On Thu, Dec 03 2020 at  8:45pm -0500,
> Ming Lei <ming.lei at redhat.com> wrote:
> 
>> On Thu, Dec 03, 2020 at 08:27:38AM -0800, Keith Busch wrote:
>>> On Thu, Dec 03, 2020 at 09:33:59AM -0500, Mike Snitzer wrote:
>>>> On Wed, Dec 02 2020 at 10:26pm -0500,
>>>> Ming Lei <ming.lei at redhat.com> wrote:
>>>>
>>>>> I understand it isn't related to correctness, because the underlying
>>>>> queue can split further by its own chunk_sectors limit. So is the issue
>>>>> too much further splitting on the queue with chunk_sectors 8, so that CPU
>>>>> utilization is increased? Or some other issue?
>>>>
>>>> No, this is all about correctness.
>>>>
>>>> Seems you're confining the definition of the possible stacking so that
>>>> the top-level device isn't allowed to have its own hard requirements on
>>>> IO sizes it sends to its internal implementation.  Just because the
>>>> underlying device can split further doesn't mean that the top-level
>>>> virtual driver can service larger IO sizes (not if the chunk_sectors
>>>> stacking throws away the hint the virtual driver provided because it
>>>> used lcm_not_zero).
>>>
>>> I may be missing something obvious here, but if the lower layers split
>>> to their desired boundary already, why does this limit need to stack?
>>> Won't it also work if each layer sets their desired chunk_sectors
>>> without considering their lower layers? The commit that initially
>>> stacked chunk_sectors doesn't provide any explanation.
>>
>> There could be several reasons:
>>
>> 1) some limits have to be stacked, such as logical block size, because
>> lower layers may not handle unaligned IO
>>
>> 2) performance reasons: if all limits are stacked on the topmost layer, in
>> theory IO just needs to be split in the top layer, and does not need to be
>> split further by any lower layer. But there can be exceptions in unusual
>> cases, such as a lower queue's limits changing after the stacked limits
>> are set up.
>>
>> 3) historical reasons: bio splitting is much younger than stacking queue
>> limits.
>>
>> Maybe others?
> 
> Hannes didn't actually justify why he added chunk_sectors to
> blk_stack_limits:
> 
> commit 987b3b26eb7b19960160505faf9b2f50ae77e14d
> Author: Hannes Reinecke <hare at suse.de>
> Date:   Tue Oct 18 15:40:31 2016 +0900
> 
>     block: update chunk_sectors in blk_stack_limits()
> 
>     Signed-off-by: Hannes Reinecke <hare at suse.com>
>     Signed-off-by: Damien Le Moal <damien.lemoal at hgst.com>
>     Reviewed-by: Christoph Hellwig <hch at lst.de>
>     Reviewed-by: Martin K. Petersen <martin.petersen at oracle.com>
>     Reviewed-by: Shaun Tancheff <shaun.tancheff at seagate.com>
>     Tested-by: Shaun Tancheff <shaun.tancheff at seagate.com>
>     Signed-off-by: Jens Axboe <axboe at fb.com>
> 
> Likely felt it needed for zoned or NVMe devices.. dunno.

For zoned drives, chunk_sectors indicates the zone size, so the stacking
propagates that value to the upper layer if that layer is also zoned. If it is
not zoned (e.g. a dm-zoned device), chunk_sectors can actually be 0: it is then
the responsibility of that layer to not issue BIOs that cross zone boundaries to
the lower zoned layer. Since all of this depends on the upper layer's zoned
model, removing the stacking of chunk_sectors would be fine, as long as the
target initialization code sets it based on the drive model being exposed. E.g.:
* dm-linear on a zoned dev will be zoned with the same zone size
* dm-zoned on a zoned dev is not zoned, so chunk_sectors can be 0
* dm-linear on a RAID volume can have chunk_sectors set to the underlying
  volume's chunk_sectors (stripe size), if dm-linear is aligned to stripes.
* etc.
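
To make the cases above concrete, here is a rough userspace sketch of what
"the target sets chunk_sectors from the model it exposes" could look like.
This is not kernel code: the types, enum values and the function
sketch_target_limits() are all hypothetical names made up for illustration.

```c
/* Hypothetical userspace model of the cases listed above; NOT kernel code. */
enum sketch_zone_model { SKETCH_ZONE_NONE, SKETCH_ZONE_HOST_MANAGED };
enum sketch_target { SKETCH_TGT_LINEAR, SKETCH_TGT_ZONED };

struct sketch_limits {
	enum sketch_zone_model zoned;
	unsigned int chunk_sectors;	/* zone or stripe size, 0 if unset */
};

/*
 * Derive the top-level limits from the model the target exposes,
 * instead of blindly stacking the lower device's chunk_sectors.
 * "aligned" says whether the mapping is aligned to the lower chunks.
 */
static struct sketch_limits
sketch_target_limits(enum sketch_target tgt,
		     const struct sketch_limits *lower, int aligned)
{
	struct sketch_limits top = { SKETCH_ZONE_NONE, 0 };

	switch (tgt) {
	case SKETCH_TGT_LINEAR:
		if (lower->zoned != SKETCH_ZONE_NONE) {
			/* linear on zoned dev: zoned, same zone size */
			top.zoned = lower->zoned;
			top.chunk_sectors = lower->chunk_sectors;
		} else if (aligned) {
			/* linear on RAID volume: reuse the stripe size */
			top.chunk_sectors = lower->chunk_sectors;
		}
		break;
	case SKETCH_TGT_ZONED:
		/*
		 * Exposes a regular block device: chunk_sectors stays 0 and
		 * the target itself must not cross zone boundaries below.
		 */
		break;
	}
	return top;
}
```

With a lower zoned device reporting chunk_sectors = 524288 (256 MiB zones in
512B sectors), the linear case returns the same value and the dm-zoned-like
case returns 0.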

> But given how we now have a model where block core, or DM core, will
> split as needed I don't think normalizing chunk_sectors (to the degree
> full use of blk_stack_limits does) and then using it as basis for
> splitting makes a lot of sense.

For zoned devs, I agree. DM core can set chunk_sectors for the DM device based
on its zone model, for the DM drivers that support zones (dm-linear, dm-flakey
and dm-zoned).
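
For reference, the arithmetic difference between the two stacking rules
discussed in this thread (lcm_not_zero vs. gcd) can be sketched in userspace
as below. The helpers sketch_gcd() and sketch_lcm_not_zero() are illustrative
stand-ins written for this example, not the kernel's lib/ implementations,
and the values 24 and 32 are made up.

```c
/* Illustrative userspace helpers; names are hypothetical, NOT kernel code. */

/* Euclid's algorithm; gcd(a, 0) == a and gcd(0, b) == b. */
static unsigned int sketch_gcd(unsigned int a, unsigned int b)
{
	while (b) {
		unsigned int r = a % b;

		a = b;
		b = r;
	}
	return a;
}

/* Least common multiple, falling back to the non-zero operand if one is 0. */
static unsigned int sketch_lcm_not_zero(unsigned int a, unsigned int b)
{
	if (a == 0 || b == 0)
		return a ? a : b;
	return a / sketch_gcd(a, b) * b;
}
```

For a top-level chunk_sectors of 24 stacked on a lower chunk_sectors of 32,
the lcm rule gives 96, which is larger than either layer's own limit, while
the gcd rule gives 8, which divides both and therefore never lets an IO cross
a boundary of either layer.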

-- 
Damien Le Moal
Western Digital Research





