[dm-devel] rdac path failure - Sun 6140
Stewart Smith
stew at cleepdar.com
Thu Aug 13 20:34:36 UTC 2009
Here is the same sequence of events, this time captured with multipathd -v3:
Aug 13 16:28:48.627 kernel: device-mapper: multipath: Failing path 8:208.
Aug 13 16:28:48.000 multipathd: vol1: rr_weight = 2 (LUN setting)
Aug 13 16:28:48.000 multipathd: vol1: pgfailback = -2 (controller setting)
Aug 13 16:28:48.000 multipathd: pg_timeout = NONE (internal default)
Aug 13 16:28:48.000 multipathd: 8:208: mark as failed
Aug 13 16:28:48.000 multipathd: uevent 'change' from '/devices/virtual/block/dm-1'
Aug 13 16:28:48.000 multipathd: UDEV_LOG=3
Aug 13 16:28:48.000 multipathd: ACTION=change
Aug 13 16:28:48.000 multipathd: DEVPATH=/devices/virtual/block/dm-1
Aug 13 16:28:48.000 multipathd: SUBSYSTEM=block
Aug 13 16:28:48.000 multipathd: DM_TARGET=multipath
Aug 13 16:28:48.000 multipathd: DM_ACTION=PATH_FAILED
Aug 13 16:28:48.000 multipathd: DM_SEQNUM=1
Aug 13 16:28:48.000 multipathd: DM_PATH=8:208
Aug 13 16:28:48.000 multipathd: DM_NR_VALID_PATHS=3
Aug 13 16:28:48.000 multipathd: DM_NAME=vol1
Aug 13 16:28:48.000 multipathd: DM_UUID=mpath-3600a0b800048335200001e5d48b68a9b
Aug 13 16:28:48.000 multipathd: MAJOR=253
Aug 13 16:28:48.000 multipathd: MINOR=1
Aug 13 16:28:48.000 multipathd: DEVTYPE=disk
Aug 13 16:28:48.000 multipathd: SEQNUM=1738
Aug 13 16:28:48.000 multipathd: UDEVD_EVENT=1
Aug 13 16:28:48.000 multipathd: DEVNAME=/dev/dm-1
Aug 13 16:28:50.000 multipathd: 8:208: reinstated
Aug 13 16:28:50.000 multipathd: vol1: remaining active paths: 4
Aug 13 16:28:50.000 multipathd: sdj: rdac prio = 3
Aug 13 16:28:50.000 multipathd: sdn: rdac prio = 3
Aug 13 16:28:50.000 multipathd: sdb: rdac prio = 0
Aug 13 16:28:50.000 multipathd: sdd: rdac prio = 0
Aug 13 16:28:50.763 kernel: device-mapper: multipath: Failing path 8:208.
Aug 13 16:28:50.000 multipathd: uevent 'change' from '/devices/virtual/block/dm-1'
Aug 13 16:28:50.000 multipathd: UDEV_LOG=3
Aug 13 16:28:50.000 multipathd: ACTION=change
Aug 13 16:28:50.000 multipathd: DEVPATH=/devices/virtual/block/dm-1
Aug 13 16:28:50.000 multipathd: SUBSYSTEM=block
Aug 13 16:28:50.000 multipathd: DM_TARGET=multipath
Aug 13 16:28:50.000 multipathd: DM_ACTION=PATH_REINSTATED
Aug 13 16:28:50.000 multipathd: DM_SEQNUM=2
Aug 13 16:28:50.000 multipathd: DM_PATH=8:208
Aug 13 16:28:50.000 multipathd: DM_NR_VALID_PATHS=4
Aug 13 16:28:50.000 multipathd: DM_NAME=vol1
Aug 13 16:28:50.000 multipathd: DM_UUID=mpath-3600a0b800048335200001e5d48b68a9b
Aug 13 16:28:50.000 multipathd: MAJOR=253
Aug 13 16:28:50.000 multipathd: MINOR=1
Aug 13 16:28:50.000 multipathd: DEVTYPE=disk
Aug 13 16:28:50.000 multipathd: SEQNUM=1739
Aug 13 16:28:50.000 multipathd: UDEVD_EVENT=1
Aug 13 16:28:50.000 multipathd: vol1: rr_weight = 2 (LUN setting)
Aug 13 16:28:50.000 multipathd: vol1: pgfailback = -2 (controller setting)
Aug 13 16:28:50.000 multipathd: pg_timeout = NONE (internal default)
Aug 13 16:28:50.000 multipathd: 8:208: mark as failed
Aug 13 16:28:50.000 multipathd: vol1: remaining active paths: 3
Aug 13 16:28:50.000 multipathd: vol1: rr_weight = 2 (LUN setting)
Aug 13 16:28:50.000 multipathd: vol1: pgfailback = -2 (controller setting)
Aug 13 16:28:50.000 multipathd: uevent 'change' from '/devices/virtual/block/dm-1'
Aug 13 16:28:50.000 multipathd: UDEV_LOG=3
Aug 13 16:28:50.000 multipathd: ACTION=change
Aug 13 16:28:50.000 multipathd: DEVPATH=/devices/virtual/block/dm-1
Aug 13 16:28:50.000 multipathd: SUBSYSTEM=block
Aug 13 16:28:50.000 multipathd: DM_TARGET=multipath
Aug 13 16:28:50.000 multipathd: DM_ACTION=PATH_FAILED
Aug 13 16:28:50.000 multipathd: DM_SEQNUM=3
Aug 13 16:28:50.000 multipathd: DM_PATH=8:208
Aug 13 16:28:50.000 multipathd: DM_NR_VALID_PATHS=3
Aug 13 16:28:50.000 multipathd: DM_NAME=vol1
Aug 13 16:28:50.000 multipathd: DM_UUID=mpath-3600a0b800048335200001e5d48b68a9b
Aug 13 16:28:50.000 multipathd: MAJOR=253
Aug 13 16:28:50.000 multipathd: MINOR=1
Aug 13 16:28:50.000 multipathd: DEVTYPE=disk
Aug 13 16:28:50.000 multipathd: SEQNUM=1740
Aug 13 16:28:50.000 multipathd: UDEVD_EVENT=1
Aug 13 16:28:50.000 multipathd: DEVNAME=/dev/dm-1
Aug 13 16:29:00.000 multipathd: 8:208: reinstated
Aug 13 16:29:00.000 multipathd: vol1: remaining active paths: 4
Aug 13 16:29:00.000 multipathd: sdj: rdac prio = 3
Aug 13 16:29:00.000 multipathd: sdn: rdac prio = 3
Aug 13 16:29:00.000 multipathd: sdb: rdac prio = 0
Aug 13 16:29:00.000 multipathd: sdd: rdac prio = 0
Aug 13 16:29:00.000 multipathd: vol1: rr_weight = 2 (LUN setting)
Aug 13 16:29:00.000 multipathd: vol1: pgfailback = -2 (controller setting)
Aug 13 16:29:00.000 multipathd: uevent 'change' from '/devices/virtual/block/dm-1'
Aug 13 16:29:00.000 multipathd: UDEV_LOG=3
Aug 13 16:29:00.000 multipathd: ACTION=change
Aug 13 16:29:00.000 multipathd: DEVPATH=/devices/virtual/block/dm-1
Aug 13 16:29:00.000 multipathd: SUBSYSTEM=block
Aug 13 16:29:00.000 multipathd: DM_TARGET=multipath
Aug 13 16:29:00.000 multipathd: DM_ACTION=PATH_REINSTATED
Aug 13 16:29:00.000 multipathd: DM_SEQNUM=4
Aug 13 16:29:00.000 multipathd: DM_PATH=8:208
Aug 13 16:29:00.000 multipathd: DM_NR_VALID_PATHS=4
Aug 13 16:29:00.000 multipathd: DM_NAME=vol1
Aug 13 16:29:00.000 multipathd: DM_UUID=mpath-3600a0b800048335200001e5d48b68a9b
Aug 13 16:29:00.000 multipathd: MAJOR=253
Aug 13 16:29:00.000 multipathd: MINOR=1
Aug 13 16:29:00.000 multipathd: DEVTYPE=disk
Aug 13 16:29:00.000 multipathd: SEQNUM=1741
Aug 13 16:29:00.000 multipathd: UDEVD_EVENT=1
Aug 13 16:29:00.000 multipathd: DEVNAME=/dev/dm-1
Aug 13 16:29:02.753 kernel: device-mapper: multipath: Failing path 8:208.
Aug 13 16:29:02.000 multipathd: vol1: rr_weight = 2 (LUN setting)
Aug 13 16:29:02.000 multipathd: vol1: pgfailback = -2 (controller setting)
Aug 13 16:29:02.000 multipathd: pg_timeout = NONE (internal default)
Aug 13 16:29:02.000 multipathd: 8:208: mark as failed
Aug 13 16:29:02.000 multipathd: uevent 'change' from '/devices/virtual/block/dm-1'
Aug 13 16:29:02.000 multipathd: UDEV_LOG=3
Aug 13 16:29:02.000 multipathd: ACTION=change
Aug 13 16:29:02.000 multipathd: DEVPATH=/devices/virtual/block/dm-1
Aug 13 16:29:02.000 multipathd: SUBSYSTEM=block
Aug 13 16:29:02.000 multipathd: DM_TARGET=multipath
Aug 13 16:29:02.000 multipathd: DM_ACTION=PATH_FAILED
Aug 13 16:29:02.000 multipathd: DM_SEQNUM=5
Aug 13 16:29:02.000 multipathd: DM_PATH=8:208
Aug 13 16:29:02.000 multipathd: DM_NR_VALID_PATHS=3
Aug 13 16:29:02.000 multipathd: DM_NAME=vol1
Aug 13 16:29:02.000 multipathd: DM_UUID=mpath-3600a0b800048335200001e5d48b68a9b
Aug 13 16:29:02.000 multipathd: MAJOR=253
Aug 13 16:29:02.000 multipathd: MINOR=1
Aug 13 16:29:02.000 multipathd: DEVTYPE=disk
Aug 13 16:29:02.000 multipathd: SEQNUM=1742
Aug 13 16:29:02.000 multipathd: UDEVD_EVENT=1
Aug 13 16:29:02.000 multipathd: DEVNAME=/dev/dm-1
Aug 13 16:29:10.000 multipathd: 8:208: reinstated
Aug 13 16:29:10.000 multipathd: vol1: remaining active paths: 4
Aug 13 16:29:10.000 multipathd: sdj: rdac prio = 3
Aug 13 16:29:10.000 multipathd: sdn: rdac prio = 3
Aug 13 16:29:10.000 multipathd: sdb: rdac prio = 0
Aug 13 16:29:10.000 multipathd: sdd: rdac prio = 0
Aug 13 16:29:10.000 multipathd: vol1: rr_weight = 2 (LUN setting)
Aug 13 16:29:10.000 multipathd: vol1: pgfailback = -2 (controller setting)
Aug 13 16:29:10.000 multipathd: uevent 'change' from '/devices/virtual/block/dm-1'
Aug 13 16:29:10.000 multipathd: UDEV_LOG=3
Aug 13 16:29:10.000 multipathd: ACTION=change
Aug 13 16:29:10.000 multipathd: DEVPATH=/devices/virtual/block/dm-1
Aug 13 16:29:10.000 multipathd: SUBSYSTEM=block
Aug 13 16:29:10.000 multipathd: DM_TARGET=multipath
Aug 13 16:29:10.000 multipathd: DM_ACTION=PATH_REINSTATED
Aug 13 16:29:10.000 multipathd: DM_SEQNUM=6
Aug 13 16:29:10.000 multipathd: DM_PATH=8:208
Aug 13 16:29:10.000 multipathd: DM_NR_VALID_PATHS=4
Aug 13 16:29:10.000 multipathd: DM_NAME=vol1
Aug 13 16:29:10.000 multipathd: DM_UUID=mpath-3600a0b800048335200001e5d48b68a9b
Aug 13 16:29:10.000 multipathd: MAJOR=253
Aug 13 16:29:10.000 multipathd: MINOR=1
Aug 13 16:29:10.000 multipathd: DEVTYPE=disk
Aug 13 16:29:10.000 multipathd: SEQNUM=1743
Aug 13 16:29:10.000 multipathd: UDEVD_EVENT=1
Aug 13 16:29:10.000 multipathd: DEVNAME=/dev/dm-1
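
To put numbers on how long 8:208 stays failed in the -v3 log above, the
"mark as failed" and "reinstated" lines can be paired up. This is only a
minimal sketch: the function name downtime_per_path and the regular
expression are assumptions made for illustration, the year 2009 is taken
from the date of this mail because the log lines carry none, and it has only
been checked against the six event lines copied from the log.

```python
import re
from datetime import datetime

# Matches multipathd lines such as
#   "Aug 13 16:28:48.000 multipathd: 8:208: mark as failed"
# An optional hostname (e.g. "localhost") before "multipathd:" is tolerated.
EVENT = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d)\.\d+ (?:\S+ )?multipathd: "
    r"(\d+:\d+): (mark as failed|reinstated)$"
)

def downtime_per_path(lines, year=2009):
    """Return (path, failed_at, reinstated_at, seconds) tuples in log order."""
    pending = {}   # path -> time of the earliest unmatched "mark as failed"
    outages = []
    for line in lines:
        m = EVENT.match(line.strip())
        if not m:
            continue
        stamp, path, what = m.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        if what == "mark as failed":
            # Keep the first failure if several arrive before a reinstate.
            pending.setdefault(path, when)
        elif path in pending:
            start = pending.pop(path)
            outages.append((path, start, when, (when - start).total_seconds()))
    return outages

# The six fail/reinstate lines for 8:208, copied from the log above.
sample = [
    "Aug 13 16:28:48.000 multipathd: 8:208: mark as failed",
    "Aug 13 16:28:50.000 multipathd: 8:208: reinstated",
    "Aug 13 16:28:50.000 multipathd: 8:208: mark as failed",
    "Aug 13 16:29:00.000 multipathd: 8:208: reinstated",
    "Aug 13 16:29:02.000 multipathd: 8:208: mark as failed",
    "Aug 13 16:29:10.000 multipathd: 8:208: reinstated",
]

for path, start, end, secs in downtime_per_path(sample):
    print(f"{path} down {secs:.0f}s ({start:%H:%M:%S} -> {end:%H:%M:%S})")
```

On those six lines the path is reported down for 2, 10 and 8 seconds.
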
On Thu, Aug 13, 2009 at 1:27 PM, Stewart Smith <stew at cleepdar.com> wrote:
>
> After a fresh multipath -F and a start of multipathd with -v 2, I see the
> following messages.
>
> After starting multipathd I mounted /dev/mapper/vol1 and generated some
> simple I/O to it using dd
>
>
> Aug 13 16:23:14.888 localhost kernel: device-mapper: multipath: Failing path 8:208.
> Aug 13 16:23:14.000 localhost multipathd: 8:208: mark as failed
> Aug 13 16:23:16.000 localhost multipathd: 8:208: reinstated
> Aug 13 16:23:30.462 localhost kernel: device-mapper: multipath: Failing path 8:208.
> Aug 13 16:23:30.000 localhost multipathd: 8:208: mark as failed
> Aug 13 16:23:39.000 localhost multipathd: 8:208: reinstated
> Aug 13 16:23:46.430 localhost kernel: device-mapper: multipath: Failing path 8:208.
> Aug 13 16:23:46.000 localhost multipathd: 8:208: mark as failed
> Aug 13 16:23:51.041 localhost kernel: device-mapper: multipath: Failing path 8:208.
> Aug 13 16:23:51.000 localhost multipathd: 8:208: mark as failed
> Aug 13 16:23:59.000 localhost multipathd: 8:208: reinstated
> Aug 13 16:24:06.465 localhost kernel: device-mapper: multipath: Failing path 8:208.
> Aug 13 16:24:06.000 localhost multipathd: 8:208: mark as failed
> Aug 13 16:24:09.000 localhost multipathd: 8:208: reinstated
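
The spacing between the kernel "Failing path" messages quoted above can be
measured directly rather than estimated by eye. The following is a minimal
sketch under stated assumptions: failure_gaps is a hypothetical helper name,
the year 2009 comes from the date of this mail, and the pattern stops at the
word "Failing" so that it also matches lines the mailer wrapped before
"path 8:208.".

```python
import re
from datetime import datetime

# Kernel lines look like:
#   "Aug 13 16:23:14.888 localhost kernel: device-mapper: multipath: Failing path 8:208."
# The hostname is optional, and a leading "> " quote prefix is ignored by search().
FAILING = re.compile(
    r"(\w{3} +\d+ \d\d:\d\d:\d\d\.\d+) .*kernel: device-mapper: multipath: Failing"
)

def failure_gaps(lines, year=2009):
    """Return the gaps, in seconds, between consecutive kernel path failures."""
    times = []
    for line in lines:
        m = FAILING.search(line)
        if m:
            times.append(
                datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S.%f")
            )
    return [round((b - a).total_seconds(), 3) for a, b in zip(times, times[1:])]

# The five kernel lines from the -v 2 run quoted above.
quoted = [
    "> Aug 13 16:23:14.888 localhost kernel: device-mapper: multipath: Failing",
    "> Aug 13 16:23:30.462 localhost kernel: device-mapper: multipath: Failing",
    "> Aug 13 16:23:46.430 localhost kernel: device-mapper: multipath: Failing",
    "> Aug 13 16:23:51.041 localhost kernel: device-mapper: multipath: Failing",
    "> Aug 13 16:24:06.465 localhost kernel: device-mapper: multipath: Failing",
]

print(failure_gaps(quoted))
```

For these five lines the gaps come out as 15.574, 15.968, 4.611 and 15.424
seconds.
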
>
>
> Thanks,
> --
> Stew
>
>
>
>
> On Thu, Aug 13, 2009 at 12:42 PM, Moger, Babu <Babu.Moger at lsi.com> wrote:
>
>> Do you have /var/log/messages file for this problem?
>>
>> Thanks
>> Babu Moger
>>
>> > -----Original Message-----
>> > From: dm-devel-bounces at redhat.com [mailto:dm-devel-bounces at redhat.com] On Behalf Of Stewart Smith
>> > Sent: Thursday, August 13, 2009 1:51 PM
>> > To: dm-devel at redhat.com
>> > Subject: [dm-devel] rdac path failure - Sun 6140
>> >
>> > Hello All,
>> >
>> > I am seeing many of these messages when my Sun 6140 array is under heavy
>> > I/O:
>> > device-mapper: multipath: Failing path 8:208.
>> > device-mapper: multipath: Failing path 8:208.
>> > device-mapper: multipath: Failing path 8:208.
>> > device-mapper: multipath: Failing path 8:208.
>> > device-mapper: multipath: Failing path 8:208.
>> >
>> >
>> > I am running a Fedora 10 server with two Fibre Channel connections to two
>> > different switches. Both controllers on the 6140 have one connection
>> > to each switch as well. The end result is that I see four paths to
>> > each LUN.
>> >
>> > When the volume is mounted and under significant load I see the
>> > messages above every few seconds. They seem to appear every
>> > "no_path_retry" seconds.
>> >
>> > The 6140 controller firmware is up to date at version 07.50.08.10 and
>> > I have installed the latest firmware for my Emulex LPe11002 cards. I
>> > have reproduced the problem using both Cisco MDS and Brocade Fibre
>> > Channel switches as well.
>> >
>> > Using CAM, I currently have the initiator Host Type set to "Linux". I
>> > have tried other options as well, without success.
>> >
>> > I have NOT installed the RDAC drivers from either Sun or LSI -
>> > primarily because they do not seem to build on my Fedora 10 kernel.
>> >
>> > Any ideas would be greatly appreciated!!!
>> >
>> > Configs and multipathd debugging output are below.
>> >
>> >
>> >
>> >
>> >
>> > Kernel: 2.6.27.24-170.2.68.fc10.x86_64
>> >
>> > # multipath -lll
>> > vol1 (3600a0b800048335200001e5d48b68a9b) dm-1 SUN,CSM200_R
>> > [size=12T][features=1 queue_if_no_path][hwhandler=1 rdac][rw]
>> > \_ round-robin 0 [prio=6][active]
>> > \_ 5:0:1:2 sdj 8:144 [active][ready]
>> > \_ 2:0:1:2 sdn 8:208 [active][ready]
>> > \_ round-robin 0 [prio=0][enabled]
>> > \_ 2:0:0:2 sdb 8:16 [active][ghost]
>> > \_ 5:0:0:2 sdd 8:48 [active][ghost]
>> >
>> >
>> > # cat /etc/multipath.conf
>> >
>> > blacklist {
>> > devnode "^sd[a-z][[0-9]*]"
>> > devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
>> > devnode "^hd[a-z][0-9]*"
>> > devnode "^cciss!c[0-9]d[0-9](p[0-9]*)*"
>> > }
>> >
>> > defaults {
>> > udev_dir /dev
>> > polling_interval 10
>> > selector "round-robin 0"
>> > path_grouping_policy multibus
>> > getuid_callout "/sbin/scsi_id --whitelisted /dev/%n"
>> > prio alua
>> > path_checker readsector0
>> > rr_min_io 100
>> > max_fds 8192
>> > rr_weight priorities
>> > failback immediate
>> > no_path_retry fail
>> > user_friendly_names yes
>> > }
>> > devices {
>> > device {
>> > vendor "SUN"
>> > product "CSM200_R"
>> > product_blacklist "Universal Xport"
>> > getuid_callout "/sbin/scsi_id --whitelisted /dev/%n"
>> > features "0"
>> > hardware_handler "1 rdac"
>> > path_selector "round-robin 0"
>> > path_grouping_policy group_by_prio
>> > failback immediate
>> > rr_weight uniform
>> > no_path_retry queue
>> > rr_min_io 1000
>> > path_checker rdac
>> > prio rdac
>> > }
>> > }
>> >
>> > multipaths {
>> > multipath {
>> > wwid 3600a0b800048335200001e5d48b68a9b
>> > alias vol1
>> > rr_weight priorities
>> > no_path_retry 5
>> > rr_min_io 100
>> > }
>> > }
>> >
>> >
>> >
>> > # multipathd -d -v3
>> >
>> >
>> > Aug 13 14:48:53 | sdb: ownership set to vol1
>> > Aug 13 14:48:53 | sdb: not found in pathvec
>> > Aug 13 14:48:53 | sdb: mask = 0xc
>> > Aug 13 14:48:53 | sdb: path checker = rdac (controller setting)
>> > Aug 13 14:48:53 | sdb: state = 4
>> > Aug 13 14:48:53 | sdb: rdac prio = 0
>> > Aug 13 14:48:53 | sdd: ownership set to vol1
>> > Aug 13 14:48:53 | sdd: not found in pathvec
>> > Aug 13 14:48:53 | sdd: mask = 0xc
>> > Aug 13 14:48:53 | sdd: path checker = rdac (controller setting)
>> > Aug 13 14:48:53 | sdd: state = 4
>> > Aug 13 14:48:53 | sdd: rdac prio = 0
>> > Aug 13 14:48:53 | sdj: ownership set to vol1
>> > Aug 13 14:48:53 | sdj: not found in pathvec
>> > Aug 13 14:48:53 | sdj: mask = 0xc
>> > Aug 13 14:48:53 | sdj: path checker = rdac (controller setting)
>> > Aug 13 14:48:53 | sdj: state = 2
>> > Aug 13 14:48:53 | sdj: rdac prio = 3
>> > Aug 13 14:48:53 | sdn: ownership set to vol1
>> > Aug 13 14:48:53 | sdn: not found in pathvec
>> > Aug 13 14:48:53 | sdn: mask = 0xc
>> > Aug 13 14:48:53 | sdn: path checker = rdac (controller setting)
>> > Aug 13 14:48:53 | sdn: state = 2
>> > Aug 13 14:48:53 | sdn: rdac prio = 3
>> > Aug 13 14:48:53 | vol1: pgfailback = -2 (controller setting)
>> > Aug 13 14:48:53 | vol1: pgpolicy = group_by_prio (controller setting)
>> > Aug 13 14:48:53 | vol1: selector = round-robin 0 (controller setting)
>> > Aug 13 14:48:53 | vol1: features = 0 (controller setting)
>> > Aug 13 14:48:53 | vol1: hwhandler = 1 rdac (controller setting)
>> > Aug 13 14:48:53 | vol1: rr_weight = 2 (LUN setting)
>> > Aug 13 14:48:53 | vol1: minio = 100 (LUN setting)
>> > Aug 13 14:48:53 | vol1: no_path_retry = 5 (multipath setting)
>> > Aug 13 14:48:53 | pg_timeout = NONE (internal default)
>> > Aug 13 14:48:53 | vol1: set ACT_CREATE (map does not exist)
>> > create: vol1 (3600a0b800048335200001e5d48b68a9b) n/a SUN,CSM200_R
>> > [size=12T][features=0][hwhandler=1 rdac][n/a]
>> > \_ round-robin 0 [prio=6][undef]
>> > \_ 5:0:1:2 sdj 8:144 [undef][ready]
>> > \_ 2:0:1:2 sdn 8:208 [undef][ready]
>> > \_ round-robin 0 [prio=0][undef]
>> > \_ 2:0:0:2 sdb 8:16 [undef][ghost]
>> > \_ 5:0:0:2 sdd 8:48 [undef][ghost]
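
The "(controller setting)", "(LUN setting)", "(multipath setting)" and
"(internal default)" tags in the debug output above suggest that each option
is taken from the most specific section of multipath.conf that sets it. The
sketch below models that as a simple lookup over the values from the quoted
config. It is an illustration only: the resolve function and the three
dictionaries are hypothetical, only a handful of keys are included, and the
real multipath-tools logic is certainly more involved than this.

```python
# Values copied from the multipath.conf quoted above (subset of keys only).
defaults = {
    "rr_weight": "priorities", "rr_min_io": 100, "no_path_retry": "fail",
    "path_checker": "readsector0", "prio": "alua",
    "path_grouping_policy": "multibus", "failback": "immediate",
}
device = {  # devices { device { vendor "SUN" product "CSM200_R" ... } }
    "rr_weight": "uniform", "rr_min_io": 1000, "no_path_retry": "queue",
    "path_checker": "rdac", "prio": "rdac",
    "path_grouping_policy": "group_by_prio", "failback": "immediate",
}
multipath = {  # multipaths { multipath { alias vol1 ... } }
    "rr_weight": "priorities", "rr_min_io": 100, "no_path_retry": 5,
}

def resolve(key):
    """Return (value, section), assuming the most specific section wins."""
    for name, section in (
        ("multipaths", multipath),
        ("devices", device),
        ("defaults", defaults),
    ):
        if key in section:
            return section[key], name
    return None, "internal default"

for key in ("rr_weight", "rr_min_io", "no_path_retry",
            "path_checker", "path_grouping_policy", "pg_timeout"):
    value, source = resolve(key)
    print(f"{key:22} = {value!s:14} ({source})")
```

Under that assumption rr_weight, rr_min_io and no_path_retry resolve from the
multipaths section, path_checker and path_grouping_policy from the devices
section, and pg_timeout, which is set nowhere, falls through to the default,
which lines up with the sources reported in the debug output.
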
>> >
>> > --
>> > dm-devel mailing list
>> > dm-devel at redhat.com
>> > https://www.redhat.com/mailman/listinfo/dm-devel
>>
>>
>
>