[dm-devel] Tuning suggestions for large systems with many (6000+) paths
Jeff Wasilko
jwasilko at ebay.com
Thu Sep 26 18:19:33 UTC 2013
Hello!
We have a 3-node database cluster running Oracle in support of a data warehouse application. To get the throughput we need, we have 4 dual-port HBAs, and the hosts are zoned to 8 or 12 array ports. This gives DM a huge number of paths to manage:
root at dbxx [/nfshome/jwasilko ] multipathd paths count
Paths: 6000
Busy: False
We're running OEL 6.3, and:
device-mapper-multipath-libs-0.4.9-64.0.1.el6.x86_64
device-mapper-multipath-0.4.9-64.0.1.el6.x86_64
We've seen issues such as high CPU usage when paths return, and we're struggling to understand why we see a 30-second I/O stall across all paths when a single path is failed (by disabling its port on the FC switch).
Here's our multipath.conf:
blacklist {
    devnode "*"
}
blacklist_exceptions {
    devnode "sd*"
}
defaults {
    user_friendly_names yes
    find_multipaths yes
    fast_io_fail_tmo 1
    dev_loss_tmo 30
    checker_timeout 10
    failback immediate
    rr_weight uniform
    no_path_retry fail
    max_fds 8192
    path_checker tur
    #rr_min_io 8
    rr_min_io_rq 8
    polling_interval 5
    path_grouping_policy multibus
    path_selector "round-robin 0"
}
devices {
    device {
        vendor "VIOLIN"
        product "SAN ARRAY"
        path_grouping_policy group_by_serial
        getuid_callout "/sbin/scsi_id --whitelisted --replace-whitespace --page=0x80 --device=/dev/%n"
        features "1 queue_if_no_path"
        hardware_handler "0"
    }
    device {
        vendor "3PARdata"
        product "VV"
        path_grouping_policy multibus
        path_checker tur
        no_path_retry 12
        features "0"
        hardware_handler "0"
        path_selector "round-robin 0"
        #path_selector "queue-length 0"
        rr_weight uniform
        rr_min_io 100
        failback immediate
        # fast_io_fail_tmo 1
        # dev_loss_tmo 30
    }
}
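Since the stall we see is about as long as dev_loss_tmo, one thing that may be worth checking is whether the fast_io_fail_tmo and dev_loss_tmo values from the defaults section actually reached the FC remote ports. The sketch below reads them back from sysfs. It assumes the standard fc_remote_ports class directory; the function name show_rport_timeouts is made up for illustration, and the optional directory argument exists only so the function can be tried against a copy of the tree.

```shell
# Print the FC transport timeouts the kernel currently holds for each remote port.
# $1 is the directory to scan (assumption: defaults to the usual sysfs class path).
show_rport_timeouts() {
    dir=${1:-/sys/class/fc_remote_ports}
    for rport in "$dir"/rport-*; do
        # Skip the unexpanded glob when no remote ports exist.
        [ -d "$rport" ] || continue
        printf '%s fast_io_fail_tmo=%s dev_loss_tmo=%s state=%s\n' \
            "$(basename "$rport")" \
            "$(cat "$rport/fast_io_fail_tmo" 2>/dev/null)" \
            "$(cat "$rport/dev_loss_tmo" 2>/dev/null)" \
            "$(cat "$rport/port_state" 2>/dev/null)"
    done
}
```

Running show_rport_timeouts with no argument on one of the hosts would list every remote port, so any port that does not show the configured values stands out.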
Thanks,
-jeff
A little more detail on the number of LUNs we have:
root at dbxx [/nfshome/jwasilko ] multipathd show multipaths status
name failback queueing paths dm-st write_prot
dbxx-data-prd-t2-78 immediate 12 chk 48 active rw
dbxx-data-prd-t1-21 immediate off 32 active rw
dbxx-data-prd-t1-22 immediate off 32 active rw
dbxx-data-prd-t1-23 immediate off 32 active rw
dbxx-data-prd-t1-18 immediate off 32 active rw
dbxx-data-prd-t1-24 immediate off 32 active rw
dbxx-data-prd-t1-19 immediate off 32 active rw
dbxx-data-prd-t1-25 immediate off 32 active rw
dbxx-data-prd-t1-20 immediate off 32 active rw
dbxx-data-prd-t1-27 immediate off 32 active rw
dbxx-data-prd-t1-29 immediate off 32 active rw
dbxx-data-prd-t1-28 immediate off 32 active rw
dbxx-data-prd-t1-30 immediate off 32 active rw
dbxx-data-prd-t1-31 immediate off 32 active rw
dbxx-reco2-1 immediate off 32 active rw
dbxx-reco2-2 immediate off 32 active rw
dbxx-reco2-3 immediate off 32 active rw
dbxx-data-prd-t1-26 immediate off 32 active rw
dbxx-data-prd-t1-17 immediate off 32 active rw
dbxx-reco2-4 immediate off 32 active rw
dbxx-vote-0 immediate 12 chk 48 active rw
dbxx-vote-1 immediate 12 chk 48 active rw
dbxx-vote-2 immediate 12 chk 48 active rw
dbxx-acfs-0 immediate 12 chk 48 active rw
dbxx-acfs-1 immediate 12 chk 48 active rw
dbxx-acfs-2 immediate 12 chk 48 active rw
dbxx-acfs-3 immediate 12 chk 48 active rw
dbxx-acfs-4 immediate 12 chk 48 active rw
dbxx-acfs-5 immediate 12 chk 48 active rw
dbxx-acfs-6 immediate 12 chk 48 active rw
dbxx-acfs-7 immediate 12 chk 48 active rw
dbxx-arch-0 immediate 12 chk 48 active rw
dbxx-arch-1 immediate 12 chk 48 active rw
dbxx-arch-2 immediate 12 chk 48 active rw
dbxx-arch-3 immediate 12 chk 48 active rw
dbxx-arch-4 immediate 12 chk 48 active rw
dbxx-arch-7 immediate 12 chk 48 active rw
dbxx-arch-6 immediate 12 chk 48 active rw
dbxx-data-prd-t2-0 immediate 12 chk 48 active rw
dbxx-arch-5 immediate 12 chk 48 active rw
dbxx-data-prd-t2-1 immediate 12 chk 48 active rw
dbxx-data-prd-t2-2 immediate 12 chk 48 active rw
dbxx-data-prd-t2-4 immediate 12 chk 48 active rw
dbxx-data-prd-t2-3 immediate 12 chk 48 active rw
dbxx-data-prd-t2-5 immediate 12 chk 48 active rw
dbxx-data-prd-t2-6 immediate 12 chk 48 active rw
dbxx-data-prd-t2-7 immediate 12 chk 48 active rw
dbxx-data-prd-t2-8 immediate 12 chk 48 active rw
dbxx-data-prd-t2-9 immediate 12 chk 48 active rw
dbxx-data-prd-t2-10 immediate 12 chk 48 active rw
dbxx-data-prd-t2-11 immediate 12 chk 48 active rw
dbxx-data-prd-t2-12 immediate 12 chk 48 active rw
dbxx-data-prd-t2-13 immediate 12 chk 48 active rw
dbxx-data-prd-t2-14 immediate 12 chk 48 active rw
dbxx-data-prd-t2-15 immediate 12 chk 48 active rw
dbxx-data-prd-t2-16 immediate 12 chk 48 active rw
dbxx-data-prd-t2-17 immediate 12 chk 48 active rw
dbxx-data-prd-t2-19 immediate 12 chk 48 active rw
dbxx-data-prd-t2-18 immediate 12 chk 48 active rw
dbxx-data-prd-t2-20 immediate 12 chk 48 active rw
dbxx-data-prd-t2-22 immediate 12 chk 48 active rw
dbxx-data-prd-t2-21 immediate 12 chk 48 active rw
dbxx-data-prd-t2-23 immediate 12 chk 48 active rw
dbxx-data-prd-t2-24 immediate 12 chk 48 active rw
dbxx-data-prd-t2-25 immediate 12 chk 48 active rw
dbxx-data-prd-t2-26 immediate 12 chk 48 active rw
dbxx-data-prd-t2-27 immediate 12 chk 48 active rw
dbxx-data-prd-t2-28 immediate 12 chk 48 active rw
dbxx-data-prd-t2-29 immediate 12 chk 48 active rw
dbxx-data-prd-t2-31 immediate 12 chk 48 active rw
dbxx-data-prd-t2-32 immediate 12 chk 48 active rw
dbxx-data-prd-t2-30 immediate 12 chk 48 active rw
dbxx-data-prd-t2-33 immediate 12 chk 48 active rw
dbxx-data-prd-t2-34 immediate 12 chk 48 active rw
dbxx-data-prd-t2-35 immediate 12 chk 48 active rw
dbxx-data-prd-t2-36 immediate 12 chk 48 active rw
dbxx-data-prd-t2-39 immediate 12 chk 48 active rw
dbxx-data-prd-t2-37 immediate 12 chk 48 active rw
dbxx-data-prd-t2-38 immediate 12 chk 48 active rw
dbxx-data-prd-t2-40 immediate 12 chk 48 active rw
dbxx-data-prd-t2-41 immediate 12 chk 48 active rw
dbxx-data-prd-t2-42 immediate 12 chk 48 active rw
dbxx-data-prd-t2-43 immediate 12 chk 48 active rw
dbxx-data-prd-t2-44 immediate 12 chk 48 active rw
dbxx-data-prd-t2-45 immediate 12 chk 48 active rw
dbxx-data-prd-t2-46 immediate 12 chk 48 active rw
dbxx-data-prd-t2-47 immediate 12 chk 48 active rw
dbxx-data-prd-t2-48 immediate 12 chk 48 active rw
dbxx-data-prd-t2-49 immediate 12 chk 48 active rw
dbxx-data-prd-t2-50 immediate 12 chk 48 active rw
dbxx-data-prd-t2-51 immediate 12 chk 48 active rw
dbxx-data-prd-t2-52 immediate 12 chk 48 active rw
dbxx-data-prd-t2-53 immediate 12 chk 48 active rw
dbxx-data-prd-t2-54 immediate 12 chk 48 active rw
dbxx-data-prd-t2-55 immediate 12 chk 48 active rw
dbxx-data-prd-t2-56 immediate 12 chk 48 active rw
dbxx-data-prd-t2-57 immediate 12 chk 48 active rw
dbxx-data-prd-t2-58 immediate 12 chk 48 active rw
dbxx-data-prd-t2-59 immediate 12 chk 48 active rw
dbxx-data-prd-t2-60 immediate 12 chk 48 active rw
dbxx-data-prd-t2-61 immediate 12 chk 48 active rw
dbxx-data-prd-t2-62 immediate 12 chk 48 active rw
dbxx-data-prd-t2-63 immediate 12 chk 48 active rw
dbxx-data-prd-t2-64 immediate 12 chk 48 active rw
dbxx-data-prd-t2-65 immediate 12 chk 48 active rw
dbxx-data-prd-t2-66 immediate 12 chk 48 active rw
dbxx-data-prd-t2-67 immediate 12 chk 48 active rw
dbxx-data-prd-t2-68 immediate 12 chk 48 active rw
dbxx-data-prd-t2-69 immediate 12 chk 48 active rw
dbxx-data-prd-t2-70 immediate 12 chk 48 active rw
dbxx-data-prd-t2-71 immediate 12 chk 48 active rw
dbxx-data-prd-t2-72 immediate 12 chk 48 active rw
dbxx-data-prd-t2-73 immediate 12 chk 48 active rw
dbxx-data-prd-t2-74 immediate 12 chk 48 active rw
dbxx-data-prd-t2-75 immediate 12 chk 48 active rw
dbxx-data-prd-t2-76 immediate 12 chk 48 active rw
dbxx-data-prd-t2-77 immediate 12 chk 48 active rw
dbxx-data-prd-t2-79 immediate 12 chk 48 active rw
dbxx-data-prd-t1-10 immediate off 32 active rw
dbxx-data-prd-t1-11 immediate off 32 active rw
dbxx-data-prd-t1-12 immediate off 32 active rw
dbxx-data-prd-t1-13 immediate off 32 active rw
dbxx-data-prd-t1-14 immediate off 32 active rw
dbxx-data-prd-t1-15 immediate off 32 active rw
dbxx-data-prd-t1-16 immediate off 32 active rw
dbxx-reco1-1 immediate off 32 active rw
dbxx-reco1-2 immediate off 32 active rw
dbxx-reco1-3 immediate off 32 active rw
dbxx-data-prd-t1-03 immediate off 32 active rw
dbxx-data-prd-t1-06 immediate off 32 active rw
dbxx-data-prd-t1-04 immediate off 32 active rw
dbxx-data-prd-t1-01 immediate off 32 active rw
dbxx-data-prd-t1-02 immediate off 32 active rw
dbxx-data-prd-t1-08 immediate off 32 active rw
dbxx-reco1-4 immediate off 32 active rw
dbxx-data-prd-t1-05 immediate off 32 active rw
dbxx-data-prd-t1-07 immediate off 32 active rw
dbxx-data-prd-t1-09 immediate off 32 active rw
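To cross-check the listing above against the total from multipathd paths count, the paths column can be tallied with a small awk filter. This is only a sketch: tally_paths is a made-up name, and it assumes the column layout shown above, where paths is the third field from the end. Counting from the right matters because the queueing column is sometimes two tokens ("12 chk").

```shell
# Tally multipath maps by path count and print the grand total.
# Reads `multipathd show multipaths status` output on stdin.
tally_paths() {
    awk '
        NR == 1 { next }              # skip the header row
        NF >= 6 {
            n = $(NF - 2)             # paths column: third field from the end
            maps[n]++
            total += n
        }
        END {
            # Group order is unspecified in awk; only the counts matter here.
            for (n in maps) printf "%d maps x %d paths\n", maps[n], n
            printf "total paths: %d\n", total
        }
    '
}
```

Usage would be: multipathd show multipaths status | tally_paths. For the listing above it should report 39 maps with 32 paths and 99 maps with 48 paths, for 6000 paths in total.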
--
Jeff Wasilko
Technical Architect, EEMS Platform Ops
eBay Enterprise
781 372 4992 M 781 820 0882 F 781 863 8118
jwasilko at ebay.com ebayenterprise.com