[Linux-cluster] SAN + multipathd + GFS : SCSI error

Simone Gotti simone.gotti at email.it
Fri Aug 10 14:14:22 UTC 2007


Hi,

I have seen various machines with QLogic HBAs hit this issue (error code
0x20000 is DID_BUS_BUSY). In my case, using device-mapper multipath, the
path that got the error was failed by dm-multipath and then reinstated
because the path checker reported it as up (the error was transient).
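
For reference, here is a small sketch of how the 0x20000 value breaks
down. It assumes the usual Linux SCSI midlayer layout of the result word
(status byte, message byte, host byte, driver byte, from low to high) and
the host byte names from the kernel's scsi.h. The function names and the
log regex are mine, written only against the dmesg line quoted below.

```python
import re

# Host byte values as defined by the Linux SCSI midlayer (scsi.h).
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
    0x09: "DID_BAD_INTR",
}

# Matches lines such as:
#   SCSI error : <0 0 1 1> return code = 0x20000
# The four numbers are host, channel, target id and LUN.
LOG_RE = re.compile(
    r"SCSI error : <(\d+) (\d+) (\d+) (\d+)> return code = (0x[0-9a-fA-F]+)"
)


def decode_result(result):
    """Split a 32-bit SCSI result word into its four bytes."""
    return {
        "status": result & 0xFF,
        "msg": (result >> 8) & 0xFF,
        "host": (result >> 16) & 0xFF,
        "driver": (result >> 24) & 0xFF,
    }


def host_byte_name(result):
    """Return the symbolic name of the host byte, if it is a known one."""
    host = decode_result(result)["host"]
    return HOST_BYTE_NAMES.get(host, "unknown(0x%02x)" % host)


def parse_scsi_error(line):
    """Return (H:C:T:L address, result code) for a dmesg line, or None."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    address = ":".join(match.group(i) for i in range(1, 5))
    return address, int(match.group(5), 16)


if __name__ == "__main__":
    parsed = parse_scsi_error("SCSI error : <0 0 1 1> return code = 0x20000")
    if parsed is not None:
        address, code = parsed
        print(address, host_byte_name(code))
```

The H:C:T:L address it extracts can be compared with the path addresses
listed by multipath -ll to see which path took the error.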

It looks like incorrect qla2xxx behavior, as reported in this knowledge
base article:  http://kbase.redhat.com/faq/FAQ_46_9001.shtm
and also in bug
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=231319
where there's a proposed fix for RHEL4 U6.

I tested the workaround proposed in the kbase in a test environment
where unfortunately this issue wasn't present, so I simulated it by
forcing an HBA LIP through sysfs, but in this test the problem didn't
disappear.
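
In case it helps to reproduce the test, this is roughly how a LIP can be
forced from a script. It is only a sketch: it assumes the kernel exposes
the FC transport attribute /sys/class/fc_host/hostN/issue_lip, which may
not be the case on every RHEL 4 kernel, and the function name and the
sysfs_root parameter are mine. A LIP disrupts I/O on the loop, so this is
for test machines only.

```python
import os


def issue_lip(host_number, sysfs_root="/sys/class/fc_host"):
    """Ask the FC transport to issue a LIP on the given SCSI host.

    Assumes the fc_host 'issue_lip' attribute exists under sysfs_root;
    the attribute path is an assumption, not taken from the kbase.
    Returns the path that was written to.
    """
    path = os.path.join(sysfs_root, "host%d" % host_number, "issue_lip")
    # Writing any value to the attribute triggers the LIP.
    with open(path, "w") as attr:
        attr.write("1\n")
    return path


if __name__ == "__main__":
    # Host 0 matches the 0:0:x:y path addresses in the quoted output.
    print("LIP requested via", issue_lip(0))
```

It needs to be run as root, and the SCSI errors should then show up in
dmesg if the problem is triggered.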

Maybe your issue is the same.

Bye!

On Fri, 2007-08-10 at 09:58 -0400, FM wrote:
> Hello,
> All servers are RHEL 4.5
> SAN is HP EVA 4000
> we are using linux qla modules and multipathd
> cluster server have only one FC Card
> 
> 
> In the dmesg of servers connected to GFS we have a lot of :
> SCSI error : <0 0 1 1> return code = 0x20000
> end_request: I/O error, dev sdd, sector 37807111
> 
> The cluster seems to work fine but I'd like to know if we can avoid this
> error.
> 
> here is a multipath -ll output :
> 
> [root at como ~]# multipath -ll
> mpath1 (3600508b4001051e40000900000310000)
> [size=500 GB][features="1 queue_if_no_path"][hwhandler="0"]
> \_ round-robin 0 [prio=50][active]
>  \_ 0:0:0:1 sda 8:0  [active][ready]
> \_ round-robin 0 [prio=10][enabled]
>  \_ 0:0:1:1 sdd 8:48 [active][ready]
> 
> mpath3 (3600508b4001051e400009000009e0000)
> [size=150 GB][features="1 queue_if_no_path"][hwhandler="0"]
> \_ round-robin 0 [prio=50][active]
>  \_ 0:0:1:2 sde 8:64 [active][ready]
> \_ round-robin 0 [prio=10][enabled]
>  \_ 0:0:0:2 sdb 8:16 [active][ready]
> 
> 
> 
> and the device in the multipath.conf
> 
> devices {
>         device {
>                 vendor                  "HP "
>                 product                 "HSV200 "
>                 path_grouping_policy    group_by_prio
>                 getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
>                 path_checker            tur
>                 path_selector           "round-robin 0"
>                 prio_callout            "/sbin/mpath_prio_alua %d"
>                 failback                immediate
>                 no_path_retry           60
>         }
> }
> 
-- 
Simone Gotti