[dm-devel] multipathd segfault and SCSI errors
Prakash Rudraraju
prakash at jigsaw.com
Fri Oct 24 06:11:38 UTC 2008
Hi,
We have set up a Compellent SAN with two HBAs attached to dual fabrics. Under load, when we import a 60GB database, paths fail very often. The following syslog excerpt shows the path failure behavior.
Oct 23 02:01:15 db03 kernel: sd 2:0:1:1: SCSI error: return code = 0x08000002
Oct 23 02:01:15 db03 kernel: sde: Current: sense key: Aborted Command
Oct 23 02:01:15 db03 kernel: Add. Sense: Internal target failure
Oct 23 02:01:15 db03 kernel:
Oct 23 02:01:15 db03 kernel: end_request: I/O error, dev sde, sector 911585239
Oct 23 02:01:15 db03 kernel: device-mapper: multipath: Failing path 8:64.
Oct 23 02:01:15 db03 multipathd: 8:64: mark as failed
Oct 23 02:01:15 db03 multipathd: mpath1: remaining active paths: 1
Oct 23 02:01:15 db03 kernel: sd 1:0:3:1: SCSI error: return code = 0x08000002
Oct 23 02:01:15 db03 kernel: sdc: Current: sense key: Aborted Command
Oct 23 02:01:15 db03 kernel: Add. Sense: Internal target failure
Oct 23 02:01:15 db03 kernel:
Oct 23 02:01:15 db03 kernel: end_request: I/O error, dev sdc, sector 911585239
Oct 23 02:01:15 db03 kernel: device-mapper: multipath: Failing path 8:32.
Oct 23 02:01:16 db03 multipathd: 8:32: mark as failed
Oct 23 02:01:16 db03 multipathd: mpath1: remaining active paths: 0
Oct 23 02:01:19 db03 multipathd: sde: tur checker reports path is up
Oct 23 02:01:19 db03 multipathd: 8:64: reinstated
Oct 23 02:01:19 db03 multipathd: mpath1: remaining active paths: 1
Oct 23 02:01:20 db03 multipathd: sdc: tur checker reports path is up
Oct 23 02:01:20 db03 multipathd: 8:32: reinstated
Oct 23 02:01:20 db03 multipathd: mpath1: remaining active paths: 2
Oct 23 02:01:21 db03 kernel: sd 2:0:1:1: SCSI error: return code = 0x08000002
Oct 23 02:01:21 db03 kernel: sde: Current: sense key: Aborted Command
Oct 23 02:01:21 db03 kernel: Add. Sense: Internal target failure
Oct 23 02:01:21 db03 kernel:
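For reference, the return code 0x08000002 in the excerpt above can be split into its component bytes. The following is a minimal sketch, assuming the usual Linux SCSI result layout (driver byte, host byte, message byte, status byte, from high to low). The function name and the small lookup tables are illustrative, not part of any tool, and the tables list only a few common values.

```python
# Sketch: split a Linux SCSI result word into its four bytes.
# Assumed layout: driver << 24 | host << 16 | msg << 8 | status.
# The tables below are deliberately incomplete.

DRIVER_BYTES = {0x00: "DRIVER_OK", 0x08: "DRIVER_SENSE"}
HOST_BYTES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
}
STATUS_BYTES = {0x00: "GOOD", 0x02: "CHECK CONDITION", 0x08: "BUSY"}


def decode_scsi_result(code):
    """Return the named driver, host and status bytes of a SCSI result."""
    driver = (code >> 24) & 0xFF
    host = (code >> 16) & 0xFF
    msg = (code >> 8) & 0xFF
    status = code & 0xFF
    return {
        "driver": DRIVER_BYTES.get(driver, hex(driver)),
        "host": HOST_BYTES.get(host, hex(host)),
        "msg": msg,
        "status": STATUS_BYTES.get(status, hex(status)),
    }


if __name__ == "__main__":
    print(decode_scsi_result(0x08000002))
```

Under that assumption, 0x08000002 decodes to DRIVER_SENSE with host byte DID_OK and status CHECK CONDITION, which matches the sense data (Aborted Command, Internal target failure) printed on the following log lines. In other words, the command reached the target and the target itself reported the failure.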
multipathd segfaults during boot. The following is from the dmesg output:
multipathd[7165]: segfault at 000000000000000a rip 00002aaaaaf51a3d rsp 00007fff03b50090 error 4
sd 2:0:1:1: SCSI error: return code = 0x08000002
sde: Current: sense key: Aborted Command
Add. Sense: Internal target failure
end_request: I/O error, dev sde, sector 912637903
device-mapper: multipath: Failing path 8:64.
sd 1:0:3:1: SCSI error: return code = 0x08000002
sdc: Current: sense key: Aborted Command
Add. Sense: Internal target failure
end_request: I/O error, dev sdc, sector 915472343
device-mapper: multipath: Failing path 8:32.
sd 2:0:1:1: SCSI error: return code = 0x08000002
sde: Current: sense key: Aborted Command
Add. Sense: Internal target failure
end_request: I/O error, dev sde, sector 915472343
device-mapper: multipath: Failing path 8:64.
sd 2:0:1:1: SCSI error: return code = 0x08000002
sde: Current: sense key: Aborted Command
Add. Sense: Internal target failure
end_request: I/O error, dev sde, sector 919728103
device-mapper: multipath: Failing path 8:64.
sd 1:0:3:1: SCSI error: return code = 0x08000002
sdc: Current: sense key: Aborted Command
Add. Sense: Internal target failure
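The segfault line at the top of the dmesg output above carries an x86 page-fault error code ("error 4"). The following is a minimal sketch that pulls the fields out of that line and decodes the error bits, assuming the standard x86 page-fault bit layout and that the kernel prints the error value in hex. The function names are illustrative.

```python
import re

# Matches the segfault line format seen in the dmesg output above.
SEGFAULT_RE = re.compile(
    r"segfault at ([0-9a-f]+) rip ([0-9a-f]+) rsp ([0-9a-f]+) error ([0-9a-f]+)"
)


def decode_page_fault_error(code):
    """Decode the low bits of an x86 page-fault error code."""
    return {
        "cause": "protection violation" if code & 0x1 else "page not present",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "kernel",
    }


def parse_segfault_line(line):
    """Return the faulting address, rip and decoded error, or None if no match."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    address, rip, _rsp, error = (int(group, 16) for group in match.groups())
    return {"address": address, "rip": rip, "error": decode_page_fault_error(error)}


if __name__ == "__main__":
    line = ("multipathd[7165]: segfault at 000000000000000a "
            "rip 00002aaaaaf51a3d rsp 00007fff03b50090 error 4")
    print(parse_segfault_line(line))
```

Read this way, error 4 is a user-mode read of a page that is not mapped, and the faulting address 0xa is very close to zero. That pattern is consistent with reading a field through a NULL pointer, though the log alone does not prove it.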
We have experienced the same failures on both RHEL 5.1 and CentOS. The following is /etc/multipathd.conf:
defaults {
    user_friendly_names yes
    path_grouping_policy multibus
}
devices {
    device {
        vendor "COMPELNT"
        product "Compellent Vol"
        path_checker tur
        polling_interval 10
        no_path_retry queue
    }
}
blacklist {
    devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
    devnode "^(hd|xvd)[a-z]*"
    wwid "*"
}
# Make sure our multipath devices are enabled.
blacklist_exceptions {
    wwid "36000d310000e63000000000000000007"
    wwid "36000d310000e6300000000000000000c"
}
# multipath -ll
mpath1 (36000d310000e6300000000000000000c) dm-5 COMPELNT,Compellent Vol
[size=500G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=2][active]
 \_ 1:0:3:1 sdc 8:32 [active][ready]
 \_ 2:0:1:1 sde 8:64 [active][ready]
mpath0 (36000d310000e63000000000000000007) dm-0 COMPELNT,Compellent Vol
[size=50G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=2][active]
 \_ 1:0:3:0 sdb 8:16 [active][ready]
 \_ 2:0:1:0 sdd 8:48 [active][ready]
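The kernel messages name failing paths by major:minor number (8:64, 8:32), while the multipath -ll output above names them by device (sde, sdc). The following is a minimal sketch of the mapping, assuming the conventional layout for block major 8, where each sd disk owns 16 consecutive minors. The function name is illustrative, and only the first 16 disks (sda to sdp) are handled.

```python
def sd_name(major, minor):
    """Map a major:minor pair on block major 8 to its sd device name."""
    if major != 8:
        raise ValueError("only block major 8 (sda..sdp) is handled")
    if not 0 <= minor <= 255:
        raise ValueError("minor must be in the range 0..255")
    # Each disk owns 16 minors: minor 0 of the group is the whole disk,
    # minors 1..15 are its partitions.
    disk, partition = divmod(minor, 16)
    name = "sd" + chr(ord("a") + disk)
    return name + (str(partition) if partition else "")


if __name__ == "__main__":
    for minor in (16, 32, 48, 64):
        print("8:%d -> %s" % (minor, sd_name(8, minor)))
```

This reproduces the pairs shown in the multipath -ll output: 8:16 is sdb, 8:32 is sdc, 8:48 is sdd and 8:64 is sde, so both failing paths (8:32 and 8:64) belong to mpath1.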
Please let me know if you need more information. This is my first experience with SAN configuration, and I feel that I have missed something very obvious, because my searches for these errors did not return any meaningful results.
Thanks,
Prakash.