[dm-devel] [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle

Laurence Oberman loberman at redhat.com
Thu Jan 18 21:45:15 UTC 2018


Hello Bart

First, let me say this: you have always been kind, patient and helpful to
me, and I have tried to be the same to you, so I am not keen to get in the
middle of this.

But it's not true about Red Hat, because I work very hard on this and I very
often find bugs you are not seeing, so Red Hat is adding value here.
I emailed you a number of times asking whether you could provide me the
exact steps, but not via your srp-test suite.

I have a setup that is not conducive to running your loop disconnects etc.,
and if you are seeing a stall on multiple loops of 02-mq, I should be able
to reproduce it without having to run your test suite.

Please let me know how I can help.

Laurence

On Thu, Jan 18, 2018 at 4:39 PM, Bart Van Assche <Bart.VanAssche at wdc.com>
wrote:

> On Thu, 2018-01-18 at 16:23 -0500, Mike Snitzer wrote:
> > On Thu, Jan 18 2018 at  3:58P -0500,
> > Bart Van Assche <Bart.VanAssche at wdc.com> wrote:
> >
> > > On Thu, 2018-01-18 at 15:48 -0500, Mike Snitzer wrote:
> > > > For Bart's test the underlying scsi-mq driver is what is regularly
> > > > hitting this case in __blk_mq_try_issue_directly():
> > > >
> > > >         if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))
> > >
> > > These lockups were all triggered by incorrect handling of
> > > .queue_rq() returning BLK_STS_RESOURCE.
> >
> > Please be precise, dm_mq_queue_rq()'s return of BLK_STS_RESOURCE?
> > "Incorrect" because it no longer runs blk_mq_delay_run_hw_queue()?
>
> In what I wrote I was referring to both dm_mq_queue_rq() and
> scsi_queue_rq().
> With "incorrect" I meant that queue lockups are introduced that make user
> space processes unkillable. That's a severe bug.
>
> > Please try to do more work analyzing the test case that only you can
> > easily run (due to srp_test being a PITA).
>
> It is not correct that I'm the only one who is able to run that software.
> Anyone who is willing to merge the latest SRP initiator and target driver
> patches in his or her tree can run that software in any VM. I'm working
> hard on getting the patches upstream that make it possible to run the
> srp-test software on a setup that is not equipped with InfiniBand hardware.
>
> > We have time to get this right, please stop hyperventilating about
> > "regressions".
>
> Sorry Mike but that's something I consider as an unfair comment. If Ming
> and you work on patches together, it's your job to make sure that no
> regressions are introduced. Instead of blaming me because I report these
> regressions you should be grateful that I take the time and effort to
> report these regressions early. And since you are employed by a large
> organization that sells Linux support services, your employer should
> invest in developing test cases that reach a higher coverage of the dm,
> SCSI and block layer code. I don't think that it's normal that my tests
> discovered several issues that were not discovered by Red Hat's internal
> test suite. That's something Red Hat has to address.
>
> Bart.