[dm-devel] kernel oops with blk-mq-sched latest

Wed Jan 18 14:40:06 UTC 2017

On 01/18/2017 03:48 AM, Hannes Reinecke wrote:
> Nearly there.
> You're missing a 'blk_mq_start_hw_queues(q)' after
> blk_mq_unfreeze_queue(); without it the queue will stall after switching
> the scheduler.

Yes indeed, forgot that. Needed after the quiesce.
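
To make the stall concrete, here is a small userspace sketch of the
ordering being discussed. It is a toy model, not kernel code: struct
toy_queue and every toy_* function are hypothetical stand-ins, and the
assumption that the quiesce leaves the hardware queues stopped until
they are explicitly restarted is built into the model.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a request queue; NOT the kernel's struct request_queue. */
struct toy_queue {
	int freeze_depth;	/* > 0 while the queue is frozen */
	bool hw_stopped;	/* assumption: set by the quiesce step */
	const char *sched;	/* name of the active scheduler */
};

static void toy_freeze(struct toy_queue *q)
{
	q->freeze_depth++;
}

static void toy_unfreeze(struct toy_queue *q)
{
	assert(q->freeze_depth > 0);
	q->freeze_depth--;
}

/* Assumption: quiescing stops the hardware queues and nothing undoes it. */
static void toy_quiesce(struct toy_queue *q)
{
	q->hw_stopped = true;
}

static void toy_start_hw_queues(struct toy_queue *q)
{
	q->hw_stopped = false;
}

/* Requests can only be dispatched when not frozen and not stopped. */
static bool toy_can_dispatch(const struct toy_queue *q)
{
	return q->freeze_depth == 0 && !q->hw_stopped;
}

/* Switch schedulers; 'restart' models the extra start-hw-queues call. */
static void toy_switch_sched(struct toy_queue *q, const char *name,
			     bool restart)
{
	toy_freeze(q);
	toy_quiesce(q);
	q->sched = name;
	toy_unfreeze(q);
	if (restart)
		toy_start_hw_queues(q);
}
```

In this model, calling toy_switch_sched() with restart set to false
leaves toy_can_dispatch() returning false even though the queue is
unfrozen, which is the stall described above. With restart set to true
the queue dispatches again.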

> Also what's quite suspicious is this:
> 
> struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
> 				    struct request_queue *q)
> {
> 	struct blkcg_gq *blkg;
> 
> 	WARN_ON_ONCE(!rcu_read_lock_held());
> 	lockdep_assert_held(q->queue_lock);
> 
> 	/*
> 	 * This could be the first entry point of blkcg implementation and
> 	 * we shouldn't allow anything to go through for a bypassing queue.
> 	 */
> 	if (unlikely(blk_queue_bypass(q)))
> 		return ERR_PTR(blk_queue_dying(q) ? -ENODEV : -EBUSY);
> 
> which now won't work as the respective flags aren't set anymore.
> Not sure if that's a problem, though.
> But you might want to look at that, too.

dying is still used on blk-mq, but yes, for blk-mq the bypass check
should now be a check for a frozen queue. That's not directly related
to the above change, but it should be fixed up.
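
As a rough illustration of that fix-up, the gate could pick its
"blocked" condition by queue type. This is a userspace sketch only:
struct toy_queue_state and toy_lookup_gate() are hypothetical names,
and the flags are plain booleans rather than the kernel's queue flags.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flattened view of the queue state the gate cares about. */
struct toy_queue_state {
	bool is_mq;	/* blk-mq queue rather than legacy single queue */
	bool bypass;	/* legacy bypass state */
	bool frozen;	/* blk-mq frozen state */
	bool dying;	/* used by both queue types */
};

/*
 * Returns 0 if a lookup may proceed, -ENODEV if the queue is blocked
 * and dying, and -EBUSY if it is blocked but still alive.
 */
static int toy_lookup_gate(const struct toy_queue_state *q)
{
	/* Assumption: blk-mq tests 'frozen' where legacy tests 'bypass'. */
	bool blocked = q->is_mq ? q->frozen : q->bypass;

	if (blocked)
		return q->dying ? -ENODEV : -EBUSY;
	return 0;
}
```

The error codes mirror the quoted snippet; only the condition that
selects them changes with the queue type.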

> Nevertheless, with the mentioned modifications to your patch the crashes
> don't occur anymore.

Great.

> Sad news is that it doesn't help _that_ much on spinning rust mpt3sas;
> there I still see a ~50% performance penalty on reads.
> Writes are slightly better than sq performance, though.

What is the test case? Full details please, from hardware to what you
are running. As I've mentioned before, I don't necessarily think your
performance issues are related to scheduling. Would be nice to get
to the bottom of it, though. And for that, I need more details.

-- 
Jens Axboe