[linux-lvm] LVM hangs on SAN fail
brem belguebli
brem.belguebli at gmail.com
Sat Apr 17 09:00:59 UTC 2010
Hi Jose,
You have a total of 8 paths per LUN: 4 are marked active through HBA host5
and the remaining 4 are marked enabled on HBA host3 (you're on 2 different
fabrics, right?). This may be due to the fact that you use the
group_by_node_name policy; I don't know whether this mode actually load
balances across the 2 HBAs.
When you pull the cable (is this the test that you're doing and that's
failing?), you say it times out forever.
As you're using the group_by_node_name policy, which groups paths by the
fc_transport target node name, you should look at the state of the target
ports bound to the HBA you disconnected (is that the test you're doing?).
If they stay in state Blocked under /sys/class/fc_remote_ports/rport-H:B-R
(where H is your HBA number) forever, it may be because dev_loss_tmo or
fast_io_fail_tmo is too high (both timers are located under
/sys/class/fc_remote_ports/rport-...).
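To check quickly, something along these lines should do (an untested
sketch; the rport names will differ on your box):

    for r in /sys/class/fc_remote_ports/rport-*; do
        # port_state should go Blocked, then drop once dev_loss_tmo fires
        echo "$r: state=$(cat $r/port_state)" \
             "dev_loss_tmo=$(cat $r/dev_loss_tmo)" \
             "fast_io_fail_tmo=$(cat $r/fast_io_fail_tmo)"
    done

If the timers look too high, you can lower them by echoing a value in
seconds into those same files.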
I have almost the same setup with almost the same storage (OPEN-V) from a
pair of HP XPs (OEM'ized Hitachi arrays), and things are set up to use a
maximum of 4 paths per LUN (2 per fabric); some storage experts tend to say
even that is too much. As the multipath policy I use multibus to distribute
I/O across the 2 fabrics.
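For comparison, a multibus variant of your OPEN-V stanza would look
roughly like this (a sketch built from the config you posted below, not a
verbatim copy of mine):

    device {
        vendor "HITACHI"
        product "OPEN-V"
        # one priority group with all paths, round-robin across both HBAs
        path_grouping_policy multibus
        failback immediate
        no_path_retry fail
    }

With multibus all paths land in a single priority group, so I/O is spread
across both fabrics instead of only failing over between them.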
Hope all this helps.
On Fri, 2010-04-16 at 08:55 +0000, jose nuno neto wrote:
> Hi
>
>
> > Can you show us a pvdisplay or a verbose vgdisplay?
> >
>
> Here is the vgdisplay -v output for one of the VGs with mirrors:
>
> ###########################################################
>
> --- Volume group ---
> VG Name vg_ora_jura
> System ID
> Format lvm2
> Metadata Areas 3
> Metadata Sequence No 705
> VG Access read/write
> VG Status resizable
> MAX LV 0
> Cur LV 4
> Open LV 4
> Max PV 0
> Cur PV 3
> Act PV 3
> VG Size 52.79 GB
> PE Size 4.00 MB
> Total PE 13515
> Alloc PE / Size 12292 / 48.02 GB
> Free PE / Size 1223 / 4.78 GB
> VG UUID nttQ3x-4ecP-Q6ms-jt2u-UIs4-texj-Q9Nxdt
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_arch
> VG Name vg_ora_jura
> LV UUID 8oUfYn-2TrP-yS6K-pcS2-cgI4-tcv1-33dSdX
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 5.00 GB
> Current LE 1280
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:28
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_export
> VG Name vg_ora_jura
> LV UUID NLfQT6-36TS-DRHq-PJRf-9UDv-L8mz-HjPea2
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 5.00 GB
> Current LE 1280
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:32
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_data
> VG Name vg_ora_jura
> LV UUID VtSBIL-XvCw-23xK-NVAH-DvYn-P2sE-OkZJro
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 12.00 GB
> Current LE 3072
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:40
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_redo
> VG Name vg_ora_jura
> LV UUID KRHKBG-71Qv-YBsA-oJDt-igzP-EYaI-gPwcBX
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 2.00 GB
> Current LE 512
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:48
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_arch_mimage_0
> VG Name vg_ora_jura
> LV UUID lQCOAt-aoK3-HBp1-xrQW-eh7L-6t94-CyAg5c
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 5.00 GB
> Current LE 1280
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:26
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_arch_mimage_1
> VG Name vg_ora_jura
> LV UUID snrnPc-8FxY-ekAk-ooNe-sBws-tuI0-cTFfj3
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 5.00 GB
> Current LE 1280
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:27
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_arch_mlog
> VG Name vg_ora_jura
> LV UUID ouqaCQ-Deex-iArv-xLe9-jg8b-5cLf-3SChQ1
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 4.00 MB
> Current LE 1
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:25
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_data_mlog
> VG Name vg_ora_jura
> LV UUID TmE2S0-r8ST-v624-RxUn-Qppw-2l8p-jM9EC9
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 4.00 MB
> Current LE 1
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:37
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_data_mimage_0
> VG Name vg_ora_jura
> LV UUID 8hR0bP-g9mR-OSXS-KdUM-ouZ6-KVdS-sfz51c
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 12.00 GB
> Current LE 3072
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:38
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_data_mimage_1
> VG Name vg_ora_jura
> LV UUID fzdzrD-7p6d-XFkA-UHyr-CPad-F2nV-6QIU9p
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 12.00 GB
> Current LE 3072
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:39
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_export_mlog
> VG Name vg_ora_jura
> LV UUID 29yLY8-N3Lv-46pN-1jze-50A2-wlhu-quuoMa
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 4.00 MB
> Current LE 1
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:29
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_export_mimage_0
> VG Name vg_ora_jura
> LV UUID 1uMTsf-wPaQ-ItTy-rpma-m2La-TGZl-C4KIU4
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 5.00 GB
> Current LE 1280
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:30
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_export_mimage_1
> VG Name vg_ora_jura
> LV UUID cm8Kn7-knL3-mUPL-XFvU-geMm-Wxff-32x2va
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 5.00 GB
> Current LE 1280
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:31
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_redo_mlog
> VG Name vg_ora_jura
> LV UUID 811tNy-eaC5-zfZQ-1QVf-cbYP-1MIM-v6waJF
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 4.00 MB
> Current LE 1
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:45
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_redo_mimage_0
> VG Name vg_ora_jura
> LV UUID aUZAer-f5rl-1f2X-9jgY-f8CJ-jdwe-F5Pmao
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 2.00 GB
> Current LE 512
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:46
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_redo_mimage_1
> VG Name vg_ora_jura
> LV UUID gAEJym-sSbq-rC4P-AjpI-OibV-k3yI-lDx1I6
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 2.00 GB
> Current LE 512
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:47
>
> --- Physical volumes ---
> PV Name /dev/mapper/mpath-dc1-b
> PV UUID hgjXU1-2qjo-RsmS-1XJI-d0kZ-oc4A-ZKCza8
> PV Status allocatable
> Total PE / Free PE 6749 / 605
>
> PV Name /dev/mapper/mpath-dc2-b
> PV UUID hcANwN-aeJT-PIAq-bPsf-9d3e-ylkS-GDjAGR
> PV Status allocatable
> Total PE / Free PE 6749 / 605
>
> PV Name /dev/mapper/mpath-dc2-mlog1p1
> PV UUID 4l9Qvo-SaAV-Ojlk-D1YB-Tkud-Yjg0-e5RkgJ
> PV Status allocatable
> Total PE / Free PE 17 / 13
>
>
>
> > On 4/15/10, jose nuno neto <jose.neto at liber4e.com> wrote:
> >> hellos
> >>
> >> I spent more time on this, and it seems that since LVM can't write to
> >> any PV on the volumes it has lost, it cannot record the failure of the
> >> devices and update the metadata on the other PVs. So it hangs forever.
> >>
> >> Is this right?
> >>
> >>> GoodMornings
> >>>
> >>> This is what I have on multipath.conf
> >>>
> >>> blacklist {
> >>> wwid SSun_VOL0_266DCF4A
> >>> wwid SSun_VOL0_5875CF4A
> >>> devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
> >>> devnode "^hd[a-z]"
> >>> }
> >>> defaults {
> >>> user_friendly_names yes
> >>> }
> >>> devices {
> >>> device {
> >>> vendor "HITACHI"
> >>> product "OPEN-V"
> >>> path_grouping_policy group_by_node_name
> >>> failback immediate
> >>> no_path_retry fail
> >>> }
> >>> device {
> >>> vendor "IET"
> >>> product "VIRTUAL-DISK"
> >>> path_checker tur
> >>> path_grouping_policy failover
> >>> failback immediate
> >>> no_path_retry fail
> >>> }
> >>> }
> >>>
> >>> As an example, this is one LUN. It shows [features=0], so I'd say it
> >>> should fail right away.
> >>>
> >>> mpath-dc2-a (360060e8004f240000000f24000000502) dm-15 HITACHI,OPEN-V -SU
> >>> [size=26G][features=0][hwhandler=0][rw]
> >>> \_ round-robin 0 [prio=4][active]
> >>> \_ 5:0:1:0 sdu 65:64 [active][ready]
> >>> \_ 5:0:1:16384 sdac 65:192 [active][ready]
> >>> \_ 5:0:1:32768 sdas 66:192 [active][ready]
> >>> \_ 5:0:1:49152 sdba 67:64 [active][ready]
> >>> \_ round-robin 0 [prio=4][enabled]
> >>> \_ 3:0:1:0 sdaw 67:0 [active][ready]
> >>> \_ 3:0:1:16384 sdbe 67:128 [active][ready]
> >>> \_ 3:0:1:32768 sdbi 67:192 [active][ready]
> >>> \_ 3:0:1:49152 sdbm 68:0 [active][ready]
> >>>
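> >>> (To double-check at the device-mapper level that queueing really is
> >>> off, something like this should do; if it prints nothing, the map
> >>> fails I/O as soon as the last path is gone:
> >>>
> >>> dmsetup table mpath-dc2-a | grep queue_if_no_path
> >>> )
> >>>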
> >>> I think they fail, since I see these messages from LVM:
> >>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
> >>> vg_syb_roger-lv_syb_roger_admin
> >>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty
> >>> devices in vg_syb_roger-lv_syb_roger_admin
> >>>
> >>> But for some reason LVM can't remove them. Is there any option I
> >>> should have in lvm.conf?
> >>>
> >>> BestRegards
> >>> Jose
> >>>> Post your multipath.conf file; you may be queueing forever?
> >>>>
> >>>>
> >>>>
> >>>> On Wed, 2010-04-14 at 15:03 +0000, jose nuno neto wrote:
> >>>>> Hi2all
> >>>>>
> >>>>> I'm on RHEL 5.4 with
> >>>>> lvm2-2.02.46-8.el5_4.1
> >>>>> 2.6.18-164.2.1.el5
> >>>>>
> >>>>> I have a multipathed SAN connection with which I'm building LVs.
> >>>>> It's a cluster system, and I want LVs to switch on failure.
> >>>>>
> >>>>> If I simulate a failure through the OS via
> >>>>> /sys/bus/scsi/devices/$DEVICE/delete,
> >>>>> the LV fails and the service switches to the other node.
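> >>>>> For example, to drop one path (the H:C:T:L address here is just
> >>>>> illustrative):
> >>>>>
> >>>>> echo 1 > /sys/bus/scsi/devices/5:0:1:0/delete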
> >>>>>
> >>>>> But if I do a "real" port-down on the SAN switch, multipath reports
> >>>>> the paths down, but LVM commands hang forever and nothing gets
> >>>>> switched.
> >>>>>
> >>>>> From the logs I see multipath failing paths, and LVM reporting
> >>>>> "Failed to remove faulty devices".
> >>>>>
> >>>>> Any ideas how I should "fix" it?
> >>>>>
> >>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Log device, 253:53, has failed.
> >>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Device failure in
> >>>>> vg_ora_scapa-lv_ora_scapa_redo
> >>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Another thread is handling an
> >>>>> event. Waiting...
> >>>>>
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
> >>>>> paths: 0
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
> >>>>> paths: 0
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
> >>>>> paths: 0
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
> >>>>> paths: 0
> >>>>>
> >>>>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
> >>>>> vg_syb_roger-lv_syb_roger_admin
> >>>>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty
> >>>>> devices in vg_syb_roger-lv_syb_roger_admin
> >>>>>
> >>>>> Much Thanks
> >>>>> Jose
> >>>>>
> >>>
> >>>
> >>
> >
> > --
> >
> > Regards,
> > Eugene Vilensky
> > evilensky at gmail.com
>
> _______________________________________________
> linux-lvm mailing list
> linux-lvm at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/