md Multipath problem with qla2300

Tyler Shaw jtylershaw at hotmail.com
Thu Apr 28 17:57:56 UTC 2005


Hello all,

I am running RHEL AS 4.0 with dual qla2342s on a SunFire 40z using the 
default qla2300 drivers that come with RHEL 4.  I set up an md multipath 
array which works beautifully.  During testing, I have been removing a 
single path to test the failover capabilities by shutting down the port on 
the switch.  Mdadm detects that the path is down and marks the device as 
faulty.  I get a nice e-mail saying that mdadm detected it as down.

The problem is that right after mdadm has marked the device as faulty, I 
re-enable the port and wait to see if it comes back.  I have been waiting 
for 30 minutes now and it's still marked as down.  mdadm --monitor --scan -f 
and mdmpd are both running and neither is picking up the fact that the 
device is back.  Previously I have been able to run mdadm --examine <device> 
and have it immediately add the device back in as active.  I'm not sure 
where the failure lies here.

Mdmpd is supposed to look at /proc/mdstat, but /proc/mdstat continually 
shows the device as faulty.  I don't know whether mdadm --monitor has the 
same ability as mdmpd to detect a device coming back.  This same 
configuration was working under RHEL 3 (but with the QLogic drivers), so I'm 
wondering if that is where the problem lies: that the Red Hat-supplied 
drivers are not reporting to the kernel that the device is back.
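
For reference, the faulty state I am describing is the "(F)" marker that 
/proc/mdstat puts after a member device.  The sketch below shows roughly how 
that marker can be picked out of the file.  It is only an illustration: the 
function name faulty_members and the SAMPLE text (device names, block count) 
are made up for the example and are not copied from my system, and this is 
not how mdmpd itself is implemented.

```python
import re

# Matches one member device on an mdstat array line, e.g. "sdc1[1](F)".
MEMBER_RE = re.compile(r"(\S+)\[(\d+)\]((?:\([A-Z]\))*)")


def faulty_members(mdstat_text):
    """Return {array_name: [faulty member devices]} from /proc/mdstat text."""
    result = {}
    for line in mdstat_text.splitlines():
        # Array lines look like "md0 : active multipath sdc1[1](F) sdb1[0]".
        m = re.match(r"^(md\d+)\s*:\s*(.*)$", line)
        if not m:
            continue
        array, rest = m.groups()
        # A member is faulty when its flag suffix contains "(F)".
        result[array] = [
            dev for dev, _idx, flags in MEMBER_RE.findall(rest) if "(F)" in flags
        ]
    return result


# Hypothetical sample, NOT taken from the system described in this post.
SAMPLE = """Personalities : [multipath]
md0 : active multipath sdc1[1](F) sdb1[0]
      1048576 blocks [2/1] [U_]
unused devices: <none>
"""

if __name__ == "__main__":
    # On a live system, read the real file instead of SAMPLE:
    #   with open("/proc/mdstat") as f: text = f.read()
    print(faulty_members(SAMPLE))
```

In my case a check like this keeps reporting the path as faulty even after 
the switch port is re-enabled, which is why I suspect the state is never 
being updated underneath md.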

Does anybody have an idea what's going on and how I can get the default 
programs to work without having to run my own custom scripts in the event of 
a failure?

Thanks,

Tyler




