[dm-devel] [PATCH 1/5] blk-mq: introduce BLK_STS_DEV_RESOURCE
Bart Van Assche
Bart.VanAssche at wdc.com
Mon Jan 22 16:49:54 UTC 2018
On Mon, 2018-01-22 at 11:35 +0800, Ming Lei wrote:
> @@ -1280,10 +1282,18 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
> * - Some but not all block drivers stop a queue before
> * returning BLK_STS_RESOURCE. Two exceptions are scsi-mq
> * and dm-rq.
> + *
> + * If drivers return BLK_STS_RESOURCE and S_SCHED_RESTART
> + * bit is set, run queue after 10ms for avoiding IO hang
> + * because the queue may be idle and the RESTART mechanism
> + * can't work any more.
> */
> - if (!blk_mq_sched_needs_restart(hctx) ||
> + needs_restart = blk_mq_sched_needs_restart(hctx);
> + if (!needs_restart ||
> (no_tag && list_empty_careful(&hctx->dispatch_wait.entry)))
> blk_mq_run_hw_queue(hctx, true);
> + else if (needs_restart && (ret == BLK_STS_RESOURCE))
> + blk_mq_delay_run_hw_queue(hctx, 10);
> }
In my opinion there are two problems with the above changes:
* Only the block driver author can know what a good choice is for the time
after which to rerun the queue. So I think moving the rerun delay (10 ms)
constant from block drivers into the core is a step backwards instead of a
step forwards.
* The purpose of the BLK_MQ_S_SCHED_RESTART flag is to detect whether or not
any of the queue runs triggered by freeing a tag happened concurrently. I
don't think that there is any relationship between queue runs happening all
or not concurrently and the chance that driver resources become available.
So deciding whether or not a queue should be rerun based on the value of
the BLK_MQ_S_SCHED_RESTART flag seems wrong to me.
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index d9ca1dfab154..55be2550c555 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2030,9 +2030,9 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
> case BLK_STS_OK:
> break;
> case BLK_STS_RESOURCE:
> - if (atomic_read(&sdev->device_busy) == 0 &&
> - !scsi_device_blocked(sdev))
> - blk_mq_delay_run_hw_queue(hctx, SCSI_QUEUE_DELAY);
> + if (atomic_read(&sdev->device_busy) ||
> + scsi_device_blocked(sdev))
> + ret = BLK_STS_DEV_RESOURCE;
> break;
> default:
> /*
The above introduces two changes that have not been mentioned in the
description of this patch:
- The queue rerunning delay is changed from 3 ms into 10 ms. Where is the
explanation of this change? Does this change have a positive or negative
performance impact?
- The above modifies a guaranteed queue rerun into a queue rerun that
may or may not happen, depending on whether or not multiple tags get freed
concurrently (return BLK_STS_DEV_RESOURCE). Sorry but I think that's wrong.
Bart.
More information about the dm-devel
mailing list