[dm-devel] [PATCH 2/2] scsi : fixing the new host byte settings (DID_TARGET_FAILURE and DID_NEXUS_FAILURE)
Mike Snitzer
snitzer at redhat.com
Tue Jan 24 23:03:20 UTC 2012
Hi Babu,
Thanks for finding this.
On Tue, Jan 24 2012 at 3:38pm -0500,
Moger, Babu <Babu.Moger at netapp.com> wrote:
> Resubmitting as my previous post had format issues and did not go linux-scsi.
>
> This patch fixes the host byte settings DID_TARGET_FAILURE and DID_NEXUS_FAILURE.
> The function __scsi_error_from_host_byte, tries to reset the host byte to DID_OK. But that
> does not happen because of the OR operation.
>
> Here is the flow.
> scsi_softirq_done-> scsi_decide_disposition -> __scsi_error_from_host_byte
or more accurately:
scsi_softirq_done -> scsi_decide_disposition
scsi_softirq_done -> scsi_finish_command -> scsi_io_completion -> __scsi_error_from_host_byte
> Let's take an example with DID_NEXUS_FAILURE. In scsi_decide_disposition, result will be set as
> DID_NEXUS_FAILURE (=0x11). Then in __scsi_error_from_host_byte, when we do OR with
> DID_OK. Purpose is to reset it back to DID_OK. But that does not happen. This patch fixes this issue.
We clearly aren't properly resetting to DID_OK but I'm not seeing an
obvious "nasty bug" that is lurking due to this. Am I missing
something?
__scsi_error_from_host_byte() is setting error which is passed back up
via blk_end_request() and blk_end_request_all(). And in my previous
testing I know that corresponding errors are making it out to
dm-multipath (e.g. -EREMOTEIO).
Also, your patch header is missing the location where DID_OK is not
properly matched (because it wasn't set exclussively due to being
or'd). Looks like scsi_noretry_cmd() will be made more efficient
because it will match DID_OK immediately. Any other locations? Would
be good to call them out.
Mike
More information about the dm-devel
mailing list