[dm-devel] Re: multipath-tools support for SGI TP9700...
Kevin M Lange
kevin_m_lange at raytheon.com
Thu Jun 26 04:49:18 UTC 2008
Christophe,
Unfortunately it does not appear that the TP9700 is working using the
multipath device settings you provided.
In our configuration, the host (a Sun X4600 running RHEL 5.2) is
connected to the TP9700 by two Fibre Channel connections.
No FC switches are used, just direct HBA-to-SP connectivity with
two HBAs and two Storage Processors. LUNs on the RAID are assigned to
either SPA or SPB to distribute the workload between the SPs
and the Fibre Channel connections.
The TP9700 can be configured to present the storage to a host by setting
the "Storage Array Host Type" (Linux, SGIRDAC, SGIAVT, Windows, etc). For
my tests, I've been experimenting with Linux and SGIRDAC. I have been
unable to determine which failover method the "Linux" storage array host
type uses, but I thought I had come across an article that said
the Linux type is basic AVT. I could be mistaken.
After setting the TP9700 host type to "Linux", I set up /etc/multipath.conf
to mimic the defaults for the TP9500:
device {
        vendor                  "SGI"
        product                 "TP9[457]00"
        getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
        prio_callout            "/sbin/mpath_prio_tpc /dev/%n"
        features                "0"
        hardware_handler        "0"
        path_grouping_policy    group_by_prio
        failback                immediate
        rr_weight               uniform
        rr_min_io               1000
        path_checker            tur
}
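As a quick sanity check on the product pattern in that stanza, the following Python sketch tests which model strings "TP9[457]00" would match. It assumes multipath treats the product field as a regular expression searched within the reported product string; Python's re module only stands in for whatever matcher multipath actually uses, and the candidate model strings are illustrative, not an inventory.

```python
import re

# Pattern copied from the "product" line of the device stanza.
PRODUCT_PATTERN = re.compile(r"TP9[457]00")

# Illustrative product strings only; not read from any real array.
candidates = ["TP9100", "TP9300", "TP9400", "TP9500", "TP9700"]


def matches(product: str) -> bool:
    """Return True if the pattern is found anywhere in the product string."""
    return PRODUCT_PATTERN.search(product) is not None


matched = [p for p in candidates if matches(p)]
print(matched)
```

If the array reports a product string that the pattern does not match, the stanza would presumably not be applied to it at all, which is worth ruling out before tuning the other settings.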
This configuration ran OK for a while, then began to log multipath
failures, and eventually I/O buffer errors. All LUNs on one SP trespassed
to the other SP, and I had to manually place each trespassed LUN back to
its primary path.
When I changed the TP9700 host type to SGIRDAC and tried the configuration
you provided, the host did not see the ghost path. Effectively, I
ended up with a single path. After disconnecting an FC connection, the host
could not see any of the LUNs assigned to the associated SP.
I modified the multipath.conf a little:
device {
        vendor                  "SGI"
        product                 "TP9700"
        path_grouping_policy    failover
        getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
        features                "1 queue_if_no_path"
        path_checker            rdac
        prio_callout            "/sbin/mpath_prio_tpc /dev/%n"
        hardware_handler        "1 rdac"
        prio                    rdac
        failback                immediate
}
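To see at a glance how this stanza differs from the TP9500-style stanza shown earlier in this message, here is a rough Python sketch that parses both and lists the keywords whose values differ. The parser is a simplification written for these two stanzas only (one keyword per line, optional double quotes); it is not how multipath itself reads multipath.conf, and the names parse_stanza, LINUX_TYPE, and RDAC_TYPE are made up for illustration.

```python
def parse_stanza(text):
    """Return {keyword: value} for one simple multipath.conf device stanza."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and the braces that delimit the stanza.
        if not line or line.startswith("#") or line in ("device {", "}"):
            continue
        parts = line.split(None, 1)
        value = parts[1].strip() if len(parts) > 1 else ""
        settings[parts[0]] = value.strip('"')
    return settings


# Stanza used with the array host type set to "Linux".
LINUX_TYPE = """
device {
vendor "SGI"
product "TP9[457]00"
getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
prio_callout "/sbin/mpath_prio_tpc /dev/%n"
features "0"
hardware_handler "0"
path_grouping_policy group_by_prio
failback immediate
rr_weight uniform
rr_min_io 1000
path_checker tur
}
"""

# Stanza used with the array host type set to SGIRDAC.
RDAC_TYPE = """
device {
vendor "SGI"
product "TP9700"
path_grouping_policy failover
getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
features "1 queue_if_no_path"
path_checker rdac
prio_callout "/sbin/mpath_prio_tpc /dev/%n"
hardware_handler "1 rdac"
prio rdac
failback immediate
}
"""

a = parse_stanza(LINUX_TYPE)
b = parse_stanza(RDAC_TYPE)

# Keep only keywords whose value differs; None means "not set in that stanza".
changed = {
    key: (a.get(key), b.get(key))
    for key in sorted(set(a) | set(b))
    if a.get(key) != b.get(key)
}

for key, (old, new) in changed.items():
    print(f"{key}: {old!r} -> {new!r}")
```

Note that prio_callout is identical in both stanzas while the second one also sets prio, so both priority settings are present in the RDAC configuration.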
This worked OK, but I see lots of SCSI sense key errors:
Jun 23 12:16:42 p4dbl03 kernel: sdbk: Current: sense key: Recovered Error
Jun 23 12:16:42 p4dbl03 kernel: <<vendor>> ASC=0x95 ASCQ=0x1ASC=0x95 ASCQ=0x1
Jun 23 12:16:42 p4dbl03 kernel:
I see those errors regardless of how I configure the RAID and
multipath.conf, which is worrisome.
I especially see those errors if I run 'fdisk -l'.
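To get a feel for how widespread these messages are, something like the following Python sketch could tally them per device and per ASC/ASCQ pair. It assumes the syslog layout shown in the excerpt above, with the vendor line rejoined onto a single line; the regular expressions are written for that excerpt only and may miss other message formats. The sample lines are copied from the excerpt, and summarize is a hypothetical helper name.

```python
import re
from collections import Counter

# "kernel: sdbk: Current: sense key: Recovered Error"
SENSE_RE = re.compile(
    r"kernel: (?P<dev>sd[a-z]+): Current: sense key: (?P<key>.+)$"
)
# The kernel prints the ASC/ASCQ pair twice with no separator, so the ASCQ
# value is matched lazily up to the next "ASC=", whitespace, or end of line.
ASC_RE = re.compile(
    r"ASC=(0x[0-9a-fA-F]+) ASCQ=(0x[0-9a-fA-F]+?)(?=ASC=|\s|$)"
)


def summarize(lines):
    """Count sense-key messages per (device, key) and per (ASC, ASCQ) pair."""
    by_device = Counter()
    codes = Counter()
    for line in lines:
        line = line.rstrip()
        m = SENSE_RE.search(line)
        if m:
            by_device[(m.group("dev"), m.group("key").strip())] += 1
            continue
        m = ASC_RE.search(line)
        if m:
            codes[(m.group(1), m.group(2))] += 1
    return by_device, codes


sample = [
    "Jun 23 12:16:42 p4dbl03 kernel: sdbk: Current: sense key: Recovered Error",
    "Jun 23 12:16:42 p4dbl03 kernel: <<vendor>> ASC=0x95 ASCQ=0x1ASC=0x95 ASCQ=0x1",
    "Jun 23 12:16:42 p4dbl03 kernel:",
]

by_device, codes = summarize(sample)
print(by_device)
print(codes)
```

Feeding it the full /var/log/messages instead of the three sample lines would show whether the errors cluster on the paths to one SP or appear on every sd device.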
Disconnecting an FC cable on one HBA caused the associated volumes to
trespass to the other SP; however, during this process I noticed buffer
I/O errors. I also noticed that the trespassed LUNs did not fail back to
their original SP when the FC cable was reconnected. Am I to assume that
RDAC or other multipath software will not tell the storage to fail back
trespassed LUNs?
Your assistance is appreciated,
- Kevin