Timing help sought (I think)

We have been running an iSCSI multipath setup for about 1.5 years, with no real failover other than testing.
Here is the hardware we are dealing with:

- EqualLogic PS disk array with dual controller modules
- QLogic 4052 HBA
- RHEL 4.5

During the testing phase things worked: if I pulled power to one switch, traffic moved over to the other. But when the first switch came back, no failback occurred. I was not too concerned about this, since the initial failover worked and Oracle kept going. (If this happened in real life, I figured we would replace the switch and reboot the boxes once things were back.)

The original switch setup did not include a trunk between the switches, which the EqualLogic expects (as we now know). This was our error and, I think, the reason the failback did not happen. By the time we discovered this, everything was in production; it came up during a routine firmware update of the switches (during scheduled maintenance), which requires a reboot.

One week later (during maintenance again) we put the trunk in place between our iSCSI switches and got spanning tree working on the switches. (The iSCSI SAN looks like a square, with two sets of switches and 1G fiber connections on one set of parallel lines.)

My issue is this: I am now seeing many path "failures" like the ones below, but these are not really failures, as the path comes back in less than 2 seconds. It seems no real I/O is affected at all.

Is this due to a setting in the defaults section of my multipath.conf? I'm thinking minimum I/O or polling interval (rr_min_io and polling_interval in the config below). The links all show good on the switches, with minimal errors (if any).

====== snip from /var/log/messages ===========
Sep 27 09:25:45 host kernel: SCSI error : <2 0 3 0> return code = 0x20000
Sep 27 09:25:45 host kernel: end_request: I/O error, dev sde, sector 161085656
Sep 27 09:25:45 host kernel: device-mapper: dm-multipath: Failing path 8:64.
Sep 27 09:25:45 host kernel: end_request: I/O error, dev sde, sector 161085664
Sep 27 09:25:45 host kernel: SCSI error : <2 0 3 0> return code = 0x20000
Sep 27 09:25:45 host kernel: end_request: I/O error, dev sde, sector 119577336
Sep 27 09:25:45 host kernel: end_request: I/O error, dev sde, sector 119577344
Sep 27 09:25:45 host kernel: SCSI error : <2 0 3 0> return code = 0x20000
Sep 27 09:25:45 host kernel: end_request: I/O error, dev sde, sector 233247600
Sep 27 09:25:45 host kernel: end_request: I/O error, dev sde, sector 233247608
Sep 27 09:25:45 host multipathd: 8:64: mark as failed
Sep 27 09:25:45 host multipathd: host.datafiles.prod: remaining active paths: 1
Sep 27 09:25:47 host multipathd: 8:64: readsector0 checker reports path is up
Sep 27 09:25:47 host multipathd: 8:64: reinstated
Sep 27 09:25:47 host multipathd: host.datafiles.prod: remaining active paths: 2
Sep 27 09:25:47 host multipathd: host.datafiles.prod: switch to path group #1
Sep 27 09:25:47 host multipathd: host.datafiles.prod: switch to path group #1
========= end snip =========================
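
To put numbers on "comes back in less than 2 seconds", here is a rough Python sketch that pairs each multipathd "mark as failed" line with the next "reinstated" line for the same path and reports the gap. It also splits the SCSI return code into its bytes, assuming the usual Linux layout of the result word (status, message, host, driver, from low byte to high); under that assumption 0x20000 has host byte 0x02, which the kernel's SCSI headers name DID_BUS_BUSY. The function names `outages` and `decode_result` are made up for this illustration, and the regex only covers the message formats seen in the snip above.

```python
import re
from datetime import datetime

# Subset of the log excerpt above (only the lines the sketch cares about).
LOG = """\
Sep 27 09:25:45 host kernel: SCSI error : <2 0 3 0> return code = 0x20000
Sep 27 09:25:45 host multipathd: 8:64: mark as failed
Sep 27 09:25:47 host multipathd: 8:64: readsector0 checker reports path is up
Sep 27 09:25:47 host multipathd: 8:64: reinstated
"""

# Matches only the two multipathd events of interest, as they appear above.
EVENT = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ multipathd: "
    r"(\d+:\d+): (mark as failed|reinstated)$"
)

# Host byte names for the first few values, as defined in the Linux SCSI headers.
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
}


def outages(lines):
    """Return (path, seconds) for each fail -> reinstate pair, in log order."""
    failed_at = {}
    result = []
    for line in lines:
        m = EVENT.match(line.strip())
        if not m:
            continue
        stamp, path, event = m.groups()
        # syslog stamps carry no year; a fixed leap year is assumed here,
        # so gaps that span a year boundary would come out wrong.
        when = datetime.strptime("2000 " + stamp, "%Y %b %d %H:%M:%S")
        if event == "mark as failed":
            failed_at[path] = when
        elif path in failed_at:
            gap = (when - failed_at.pop(path)).total_seconds()
            result.append((path, gap))
    return result


def decode_result(code):
    """Split a SCSI result word into its four bytes (assumed layout)."""
    return {
        "status": code & 0xFF,
        "msg": (code >> 8) & 0xFF,
        "host": (code >> 16) & 0xFF,
        "driver": (code >> 24) & 0xFF,
    }


if __name__ == "__main__":
    for path, seconds in outages(LOG.splitlines()):
        print(path, "was down for", seconds, "seconds")
    host = decode_result(0x20000)["host"]
    print("host byte:", hex(host), HOST_BYTE_NAMES.get(host, "unknown"))
```

With one-second syslog resolution the gap in the snip comes out as 2 seconds, which happens to equal the polling_interval of 2 in the config below. That fits the path checker reinstating the path on its next pass, but it does not by itself say why the path was failed in the first place.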

========= /etc/multipath.conf ================
defaults {
    multipath_tool "/sbin/multipath -v0"
    udev_dir /dev
    polling_interval 2
    selector "round-robin 0"
    path_grouping_policy failover
    getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
    path_checker readsector0
    prio_callout "/bin/true"
    features "0"
    rr_min_io 2
    rr_weight priorities
    failback immediate
    no_path_retry fail
    user_friendly_name yes
}
## everything is friendly names and ignore devices below
=========== end ======================


-- 
:wq!
kevin.foote