[dm-devel] [PATCH] dm-mpath: push back requests instead of queueing

Junichi Nomura j-nomura at ce.jp.nec.com
Tue Nov 12 10:00:52 UTC 2013


On 11/12/13 18:05, Hannes Reinecke wrote:
> On 11/12/2013 09:43 AM, Junichi Nomura wrote:
>> On 11/12/13 17:17, Hannes Reinecke wrote:
>>> On 11/12/2013 08:48 AM, Junichi Nomura wrote:
>>>> On 11/08/13 18:02, Hannes Reinecke wrote:
>>>>> In doing so we can also remove the ->busy check as a requeue is identical
>>>>> to ->busy returning 'true' from the callers perspective. This simplifies
>>>>> the code by quite a lot.
>>>>
>>>> They are not identical.
>>>> ->busy returns true when underlying path cannot dispatch a request
>>>> immediately. In that case it's better to keep the request in queue
>>>> to allow merges.  (It used to have real impact on buffered sequential
>>>> write + fsync workload, though the performance impact might become
>>>> smaller in recent kernels due to plugging change.)
>>>> Also ->busy can be used by upper layer (dm_table_any_busy_target).
>>>> So you can't just remove it.
>>>>
>>> But they are identical from the callers perspective:
>>> drivers/md/dm.c:dm_request_fn()
>>>
>>> 		if (ti->type->busy && ti->type->busy(ti))
>>> 			goto delay_and_out;
>>>
>>> 		clone = dm_start_request(md, rq);
>>>
>>> 		spin_unlock(q->queue_lock);
>>> 		if (map_request(ti, clone, md))
>>> 			goto requeued;
>>>
>>> 		BUG_ON(!irqs_disabled());
>>> 		spin_lock(q->queue_lock);
>>> 	}
>>>
>>> 	goto out;
>>>
>>> requeued:
>>> 	BUG_ON(!irqs_disabled());
>>> 	spin_lock(q->queue_lock);
>>>
>>> delay_and_out:
>>> 	blk_delay_queue(q, HZ / 10);
>>>
>>> So the only difference between ->busy and return 'DM_MAPIO_REQUEUE'
>>> is that 'busy' runs under the queue_lock.
>>
>> A difference is whether the request is once dequeued or not.
>> I think requeued request does not accept any merge.
>>
> Hmm. Now _that_ is a valid point.
> But I would've thought all possible merges are already done by
> the time request_fn() is called.
> If not multipathing would be working suboptimal anyway, as a
> potential merge has been missed even during normal I/O.
> 
>> But the point is not there; if you remove ->busy, nobody checks whether
>> underlying device is busy and DM_MAPIO_REQUEUE won't be returned.
>> So the other option might be moving ->busy check in request_fn to
>> inside of map function and let it return DM_MAPIO_REQUEUE.
>>
> ???
> 
> But that's precisely what the patch is doing, no?
> The only thing we're not doing in map_io() is to check for
> pgpath_busy(), but that would be pointless as a busy pgpath
> would return DM_MAPIO_REQUEUE anyway.

No.  As you've removed __pgpath_busy(), nobody calls
dm_underlying_device_busy(), which calls scsi_lld_busy().

> We _could_ optimize this in __switch_pg(), to call pgpath_busy()
> when selecting the paths. But that should be done by the path selector.
> So for that we would need to separate the functionality of the
> path selector and __switch_pg; currently it's unclear whether
> we need to call pgpath_busy() in __switch_pg or not.

There is no need to call pgpath_busy in __switch_pg.

If we call __pgpath_busy() in map function, I think it's here:

	if (pgpath) {
+ 		if (__pgpath_busy(pgpath))
+ 			r = DM_MAPIO_REQUEUE;
		else if (pg_ready(m)) {
			... // remap
			r = DM_MAPIO_REMAPPED;
		} else {
			__pg_init_all_paths(m);
			r = DM_MAPIO_REQUEUE;
		}
	...

or in a path selector.

> The main reason for removing 'busy' is that it's really hard
> to do right for the queue_if_no_path case.
> We can easily do a ->busy check during pg_init (by just checking
> ->queue_io), but the queue_if_no_path _condition_ is only
> established after we've called __switch_pg(). Which is precisely
> what we do _not_ want here.
> Maybe we should add a flag here; 'all_paths_down' or something.
> That could be easily checked from ->busy. Plus it can only be
> unset on 'reinstate_path'. Hmm.

-- 
Jun'ichi Nomura, NEC Corporation




More information about the dm-devel mailing list