[dm-devel] fragmented i/o with 2.6.31?
Kiyoshi Ueda
k-ueda at ct.jp.nec.com
Thu Sep 17 08:02:39 UTC 2009
Hi David, Mike, Alasdair,
On 09/17/2009 01:22 AM +0900, David Strand wrote:
> On Wed, Sep 16, 2009 at 8:34 AM, David Strand <dpstrand at gmail.com> wrote:
>> I am issuing 512 Kbyte reads through the device mapper device node to
>> a fibre channel disk. With 2.6.30 one read command for the entire 512
>> Kbyte length is placed on the wire. With 2.6.31 this is being broken
>> up into 5 smaller read commands placed on the wire, decreasing
>> performance.
>>
>> This is especially penalizing on some disks where we have prefetch
>> turned off via the scsi mode page. Is there any easy way (through
>> configuration or sysfs) to restore the single read per i/o behavior
>> that I used to get?
>
> I should note that I am using dm-mpath, and the i/o is fragmented on
> the wire when using the device mapper device node but it is not
> fragmented when using one of the regular /dev/sd* device nodes for
> that device.
David,
Thank you for reporting this.
I found on my test machine that max_sectors is set to SAFE_MAX_SECTORS,
which limits the I/O size to a small value.
The attached patch fixes it. I expect the patch (together with increasing
the read-ahead size in /sys/block/dm-<n>/queue/read_ahead_kb) will solve
your fragmentation issue. Please try it.
Mike, Alasdair,
I found that max_sectors and max_hw_sectors of a dm device are set
to smaller values than those of the underlying devices. E.g.:
# cat /sys/block/sdj/queue/max_sectors_kb
512
# cat /sys/block/sdj/queue/max_hw_sectors_kb
32767
# echo "0 10 linear /dev/sdj 0" | dmsetup create test
# cat /sys/block/dm-0/queue/max_sectors_kb
127
# cat /sys/block/dm-0/queue/max_hw_sectors_kb
127
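For reference, the 127 reported for dm-0 matches what SAFE_MAX_SECTORS would give, since the *_kb files report the limit in KB by halving the count of 512-byte sectors. A minimal sketch of that arithmetic follows. It assumes SAFE_MAX_SECTORS is 255, as I believe it is in the 2.6.31 headers, and sectors_to_kb() is a hypothetical helper for illustration, not a kernel function:

```c
/* Assumption: value of SAFE_MAX_SECTORS in the 2.6.31 block layer headers. */
#define SAFE_MAX_SECTORS 255u

/*
 * Hypothetical helper: convert a count of 512-byte sectors to KB,
 * rounding down (two sectors per KB).
 */
unsigned int sectors_to_kb(unsigned int sectors)
{
	return sectors >> 1;
}
```

With these assumptions, sectors_to_kb(SAFE_MAX_SECTORS) gives 127, the value seen on dm-0 for both max_sectors_kb and max_hw_sectors_kb.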
This prevents the I/O size of a struct request from becoming large
enough, and causes undesired request fragmentation in request-based dm.
This appears to be caused by the queue_limits stacking.
In dm_calculate_queue_limits(), the block layer's small default size
is included in the merging process of the targets' queue_limits,
so the underlying queue_limits are not propagated correctly.
I think initializing the default values of all max_* to '0' is an easy fix.
Do you think my patch is acceptable?
Any other ideas to fix this problem?
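To illustrate why a default of 0 helps, here is a simplified model of the stacking. It is only a sketch: it assumes the stacking code takes the smaller of the two values while treating 0 as "not set" (the idea behind the kernel's min_not_zero()), and struct limits_model, min_not_zero_u() and stack_limits_model() are hypothetical names, not the kernel's own definitions:

```c
/* Simplified model of two limits; NOT the kernel's struct queue_limits. */
struct limits_model {
	unsigned int max_sectors;
	unsigned int max_hw_sectors;
};

/* Smaller of two values, where 0 means "not set" and is ignored. */
unsigned int min_not_zero_u(unsigned int a, unsigned int b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

/*
 * Hypothetical stand-in for stacking an underlying device's limits
 * ("bottom") into the dm device's limits ("top").
 */
void stack_limits_model(struct limits_model *top,
			const struct limits_model *bottom)
{
	top->max_sectors = min_not_zero_u(top->max_sectors,
					  bottom->max_sectors);
	top->max_hw_sectors = min_not_zero_u(top->max_hw_sectors,
					     bottom->max_hw_sectors);
}
```

In this model, if top starts at the default of 255 sectors and bottom has max_sectors = 1024 (512 KB, as on sdj), top stays at 255 because the default wins the comparison. If top starts at 0, it takes the underlying device's 1024 instead, which is the behavior the patch aims for.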
Signed-off-by: Kiyoshi Ueda <k-ueda at ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura at ce.jp.nec.com>
Cc: David Strand <dpstrand at gmail.com>
Cc: Mike Snitzer <snitzer at redhat.com>
Cc: Alasdair G Kergon <agk at redhat.com>
---
drivers/md/dm-table.c | 4 ++++
1 file changed, 4 insertions(+)
Index: 2.6.31/drivers/md/dm-table.c
===================================================================
--- 2.6.31.orig/drivers/md/dm-table.c
+++ 2.6.31/drivers/md/dm-table.c
@@ -992,9 +992,13 @@ int dm_calculate_queue_limits(struct dm_
 	unsigned i = 0;
 	blk_set_default_limits(limits);
+	limits->max_sectors = 0;
+	limits->max_hw_sectors = 0;
 	while (i < dm_table_get_num_targets(table)) {
 		blk_set_default_limits(&ti_limits);
+		ti_limits.max_sectors = 0;
+		ti_limits.max_hw_sectors = 0;
 		ti = dm_table_get_target(table, i++);