[dm-devel] [PATCH] dm-mpath: push back requests instead of queueing
Hannes Reinecke
hare at suse.de
Tue Nov 12 08:49:16 UTC 2013
On 11/12/2013 09:17 AM, Hannes Reinecke wrote:
> On 11/12/2013 08:48 AM, Junichi Nomura wrote:
>> On 11/08/13 18:02, Hannes Reinecke wrote:
>>> There is no reason why multipath needs to queue requests
>>> internally for queue_if_no_path or pg_init; we should
>>> rather push them back onto the request queue.
>>>
>>>
>>> This patch removes the internal queueing mechanism from dm-multipath.
>>
>> Hi Hannes,
>> thanks for working on this.
>>
>>> In doing so we can also remove the ->busy check, as a requeue is identical
>>> to ->busy returning 'true' from the caller's perspective. This simplifies
>>> the code by quite a lot.
>>
>> They are not identical.
>> ->busy returns true when underlying path cannot dispatch a request
>> immediately. In that case it's better to keep the request in queue
>> to allow merges. (It used to have real impact on buffered sequential
>> write + fsync workload, though the performance impact might become
>> smaller in recent kernels due to plugging change.)
>> Also, ->busy can be used by an upper layer (dm_table_any_busy_target),
>> so you can't just remove it.
>>
> But they are identical from the caller's perspective:
> drivers/md/dm.c:dm_request_fn()
>
> 		if (ti->type->busy && ti->type->busy(ti))
> 			goto delay_and_out;
>
> 		clone = dm_start_request(md, rq);
>
> 		spin_unlock(q->queue_lock);
> 		if (map_request(ti, clone, md))
> 			goto requeued;
>
> 		BUG_ON(!irqs_disabled());
> 		spin_lock(q->queue_lock);
> 	}
>
> 	goto out;
>
> requeued:
> 	BUG_ON(!irqs_disabled());
> 	spin_lock(q->queue_lock);
>
> delay_and_out:
> 	blk_delay_queue(q, HZ / 10);
>
> So the only difference between ->busy and return 'DM_MAPIO_REQUEUE'
> is that 'busy' runs under the queue_lock.
>
> And yes, in theory ->busy might be used from any upper layer; but
> so far the only caller I've found _is_ in dm_request_fn().
> So removing it doesn't do any harm.
>
> Unless I've misread something ...
>
>>> @@ -385,21 +378,15 @@ static int map_io(struct multipath *m, struct request *clone,
>>> (!m->queue_io && (m->repeat_count && --m->repeat_count == 0)))
>>> __choose_pgpath(m, nr_bytes);
>>>
>>> - pgpath = m->current_pgpath;
>>> -
>>> - if (was_queued)
>>> - m->queue_size--;
>>> + if (m->current_pgpath &&
>>> + m->pg_init_required && !m->pg_init_in_progress)
>>> + __pg_init_all_paths(m);
>>>
>>> + pgpath = m->current_pgpath;
>>> if ((pgpath && m->queue_io) ||
>>> (!pgpath && m->queue_if_no_path)) {
>>> - /* Queue for the daemon to resubmit */
>>> - list_add_tail(&clone->queuelist, &m->queued_ios);
>>> - m->queue_size++;
>>> - if ((m->pg_init_required && !m->pg_init_in_progress) ||
>>> - !m->queue_io)
>>> - queue_work(kmultipathd, &m->process_queued_ios);
>>> pgpath = NULL;
>>> - r = DM_MAPIO_SUBMITTED;
>>> + r = DM_MAPIO_REQUEUE;
>>
>> if/else sequence in map_io() might be easier to read if we do:
>>
>> if (pgpath) {
>> 	if (pg_ready(m)) {
>> 		... /* remap */
>> 		r = DM_MAPIO_REMAPPED;
>> 	} else {
>> 		__pg_init_all_paths(m);
>> 		r = DM_MAPIO_REQUEUE;
>> 	}
>> } else { /* no path */
>> 	if (need_requeue(m))
>> 		r = DM_MAPIO_REQUEUE;
>> 	else
>> 		r = -EIO;
>> }
>>
>> where:
>> #define pg_ready(m) (!m->queue_io) /* or rename 'queue_io' */
>> #define need_requeue(m) (m->queue_if_no_path || __must_push_back(m))
>>
>> and move pg_init_required, etc. checks to __pg_init_all_paths().
>>
> True. Will be doing that.
>
>>> @@ -1241,7 +1168,8 @@ static void pg_init_done(void *data, int errors)
>>> m->queue_io = 0;
>>>
>>> m->pg_init_delay_retry = delay_retry;
>>> - queue_work(kmultipathd, &m->process_queued_ios);
>>> + if (!m->queue_io)
>>> + dm_table_run_queue(m->ti->table);
>>>
>>> /*
>>> * Wake up any thread waiting to suspend.
>>
>> process_queued_ios was responsible for retrying pg_init.
>> And when retrying, m->queue_io is still 0.
>> So don't we have to run the queue unconditionally here,
>> or call __pg_init_all_paths() directly?
>>
> In my rework I've _tried_ to separate both functions from
> process_queued_ios().
> But yes, you are right; I haven't considered pg_init_retry.
> Will be updating the patch.
>
Actually, I have. I just had a closer look at the patch,
and pg_init retry is handled, albeit differently than in
the original.
It now works like this:
-> map_io() is called
   -> calls __switch_pg(), which sets 'queue_io'
   -> calls __pg_init_all_paths(), which pushes activate_path
      onto a workqueue
   -> returns DM_MAPIO_REQUEUE
-> pg_init_done()
   -> checks pg_init_required
      -> if non-zero, some other I/O has already
         kicked off an 'activate_path',
         so there is nothing to be done from our side
      -> if zero, we kick the queue
         via blk_run_queue()
And blk_run_queue() will call into ->request_fn,
which pulls requests off the queue.
So the next request calls map_io() again, and the
entire game starts anew, retrying pg_init.
The only thing we're not handling properly is
'pg_init_delay_retry': for that we should have
started the queue with a certain delay, which
we currently don't do. But that's easily fixable.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare at suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)