[linux-lvm] LVM hangs on SAN fail

brem belguebli brem.belguebli at gmail.com
Sat Apr 17 09:00:59 UTC 2010


Hi Jose,

You have a total of 8 paths per LUN: 4 are marked active through HBA host5
and the remaining 4 are marked enabled on HBA host3 (you're on 2 different
fabrics, right?). This may be due to the fact that you use the policy
group_by_node_name; I don't know whether this mode actually load-balances
across the 2 HBAs.
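
If you want to double-check how the two groups map to target node names,
the fc_transport info is right there in sysfs. Something like this (these
are standard fc_transport attributes, nothing here is specific to my setup):

  # each rport is named rport-H:B-R, so the H in the filename tells you
  # which HBA (SCSI host) the target port hangs off
  grep . /sys/class/fc_remote_ports/rport-*/node_name

If the paths split into two groups with two distinct target node names,
that would confirm group_by_node_name is doing the grouping.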


When you pull the cable (is this the test that you're doing and that is
failing?), you say it times out forever.
As you're using the policy group_by_node_name, which corresponds to the
fc_transport target node name, you should look at the state of the target
ports bound to the HBA you disconnected, under
/sys/class/fc_remote_ports/rport-H:B-R (where H is your HBA host number).
If a port stays in state Blocked forever, it may be because dev_loss_tmo or
fast_io_fail_tmo is set too high (both timers live under the same
/sys/class/fc_remote_ports/rport-H:B-R directory).
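
As a sketch (the rport-5:0-0 name below is just an example, use whatever
rports actually exist on your box):

  # Online / Blocked / Not Present ?
  cat /sys/class/fc_remote_ports/rport-5:0-0/port_state
  # current timer values, in seconds
  cat /sys/class/fc_remote_ports/rport-5:0-0/dev_loss_tmo
  cat /sys/class/fc_remote_ports/rport-5:0-0/fast_io_fail_tmo
  # lowering fast_io_fail_tmo makes I/O on a dead path error out sooner
  # instead of hanging until dev_loss_tmo expires
  echo 5 > /sys/class/fc_remote_ports/rport-5:0-0/fast_io_fail_tmo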

I have almost the same setup with almost the same storage (OPEN-V), from a
pair of HP XPs (OEM'ized Hitachi arrays), and things are set up to use at
most 4 paths per LUN (2 per fabric); some storage experts tend to say even
that is already too much. As the multipath policy I use multibus to
distribute I/O across the 2 fabrics.
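
If you want to try multibus on your OPEN-V devices, the stanza would look
something like this (untested on your hardware, so treat it as a sketch to
adapt, based on the conf you posted):

  devices {
         device {
                  vendor                          "HITACHI"
                  product                         "OPEN-V"
                  path_grouping_policy            multibus
                  failback                        immediate
                  no_path_retry                   fail
         }
  }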

Hope all this helps.

On Fri, 2010-04-16 at 08:55 +0000, jose nuno neto wrote:
> Hi
> 
> 
> > Can you show us a pvdisplay or verbose vgdisplay ?
> >
> 
> Here goes the vgdisplay -v of one of the vgs with mirrors
> 
> ###########################################################
> 
> --- Volume group ---
>   VG Name               vg_ora_jura
>   System ID
>   Format                lvm2
>   Metadata Areas        3
>   Metadata Sequence No  705
>   VG Access             read/write
>   VG Status             resizable
>   MAX LV                0
>   Cur LV                4
>   Open LV               4
>   Max PV                0
>   Cur PV                3
>   Act PV                3
>   VG Size               52.79 GB
>   PE Size               4.00 MB
>   Total PE              13515
>   Alloc PE / Size       12292 / 48.02 GB
>   Free  PE / Size       1223 / 4.78 GB
>   VG UUID               nttQ3x-4ecP-Q6ms-jt2u-UIs4-texj-Q9Nxdt
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_arch
>   VG Name                vg_ora_jura
>   LV UUID                8oUfYn-2TrP-yS6K-pcS2-cgI4-tcv1-33dSdX
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                5.00 GB
>   Current LE             1280
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:28
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_export
>   VG Name                vg_ora_jura
>   LV UUID                NLfQT6-36TS-DRHq-PJRf-9UDv-L8mz-HjPea2
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                5.00 GB
>   Current LE             1280
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:32
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_data
>   VG Name                vg_ora_jura
>   LV UUID                VtSBIL-XvCw-23xK-NVAH-DvYn-P2sE-OkZJro
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                12.00 GB
>   Current LE             3072
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:40
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_redo
>   VG Name                vg_ora_jura
>   LV UUID                KRHKBG-71Qv-YBsA-oJDt-igzP-EYaI-gPwcBX
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                2.00 GB
>   Current LE             512
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:48
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_arch_mimage_0
>   VG Name                vg_ora_jura
>   LV UUID                lQCOAt-aoK3-HBp1-xrQW-eh7L-6t94-CyAg5c
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                5.00 GB
>   Current LE             1280
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:26
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_arch_mimage_1
>   VG Name                vg_ora_jura
>   LV UUID                snrnPc-8FxY-ekAk-ooNe-sBws-tuI0-cTFfj3
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                5.00 GB
>   Current LE             1280
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:27
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_arch_mlog
>   VG Name                vg_ora_jura
>   LV UUID                ouqaCQ-Deex-iArv-xLe9-jg8b-5cLf-3SChQ1
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                4.00 MB
>   Current LE             1
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:25
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_data_mlog
>   VG Name                vg_ora_jura
>   LV UUID                TmE2S0-r8ST-v624-RxUn-Qppw-2l8p-jM9EC9
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                4.00 MB
>   Current LE             1
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:37
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_data_mimage_0
>   VG Name                vg_ora_jura
>   LV UUID                8hR0bP-g9mR-OSXS-KdUM-ouZ6-KVdS-sfz51c
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                12.00 GB
>   Current LE             3072
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:38
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_data_mimage_1
>   VG Name                vg_ora_jura
>   LV UUID                fzdzrD-7p6d-XFkA-UHyr-CPad-F2nV-6QIU9p
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                12.00 GB
>   Current LE             3072
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:39
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_export_mlog
>   VG Name                vg_ora_jura
>   LV UUID                29yLY8-N3Lv-46pN-1jze-50A2-wlhu-quuoMa
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                4.00 MB
>   Current LE             1
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:29
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_export_mimage_0
>   VG Name                vg_ora_jura
>   LV UUID                1uMTsf-wPaQ-ItTy-rpma-m2La-TGZl-C4KIU4
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                5.00 GB
>   Current LE             1280
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:30
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_export_mimage_1
>   VG Name                vg_ora_jura
>   LV UUID                cm8Kn7-knL3-mUPL-XFvU-geMm-Wxff-32x2va
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                5.00 GB
>   Current LE             1280
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:31
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_redo_mlog
>   VG Name                vg_ora_jura
>   LV UUID                811tNy-eaC5-zfZQ-1QVf-cbYP-1MIM-v6waJF
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                4.00 MB
>   Current LE             1
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:45
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_redo_mimage_0
>   VG Name                vg_ora_jura
>   LV UUID                aUZAer-f5rl-1f2X-9jgY-f8CJ-jdwe-F5Pmao
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                2.00 GB
>   Current LE             512
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:46
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_redo_mimage_1
>   VG Name                vg_ora_jura
>   LV UUID                gAEJym-sSbq-rC4P-AjpI-OibV-k3yI-lDx1I6
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                2.00 GB
>   Current LE             512
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:47
> 
>   --- Physical volumes ---
>   PV Name               /dev/mapper/mpath-dc1-b
>   PV UUID               hgjXU1-2qjo-RsmS-1XJI-d0kZ-oc4A-ZKCza8
>   PV Status             allocatable
>   Total PE / Free PE    6749 / 605
> 
>   PV Name               /dev/mapper/mpath-dc2-b
>   PV UUID               hcANwN-aeJT-PIAq-bPsf-9d3e-ylkS-GDjAGR
>   PV Status             allocatable
>   Total PE / Free PE    6749 / 605
> 
>   PV Name               /dev/mapper/mpath-dc2-mlog1p1
>   PV UUID               4l9Qvo-SaAV-Ojlk-D1YB-Tkud-Yjg0-e5RkgJ
>   PV Status             allocatable
>   Total PE / Free PE    17 / 13
> 
> 
> 
> > On 4/15/10, jose nuno neto <jose.neto at liber4e.com> wrote:
> >> hellos
> >>
> >> I spent more time on this, and it seems that since LVM can't write to
> >> any PV on the volumes it has lost, it cannot record the failure of the
> >> devices and update the metadata on the other PVs. So it hangs forever.
> >>
> >> Is this right?
> >>
> >>> GoodMornings
> >>>
> >>> This is what I have on multipath.conf
> >>>
> >>> blacklist {
> >>>         wwid SSun_VOL0_266DCF4A
> >>>         wwid SSun_VOL0_5875CF4A
> >>>         devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
> >>>         devnode "^hd[a-z]"
> >>> }
> >>> defaults {
> >>>                 user_friendly_names             yes
> >>> }
> >>> devices {
> >>>        device {
> >>>                 vendor                          "HITACHI"
> >>>                 product                         "OPEN-V"
> >>>                 path_grouping_policy            group_by_node_name
> >>>                 failback                        immediate
> >>>                 no_path_retry                   fail
> >>>        }
> >>>        device {
> >>>                 vendor                          "IET"
> >>>                 product                         "VIRTUAL-DISK"
> >>>                 path_checker                    tur
> >>>                 path_grouping_policy            failover
> >>>                 failback                        immediate
> >>>                 no_path_retry                   fail
> >>>        }
> >>> }
> >>>
> >>> As an example, this is one LUN. It shows [features=0], so I'd say it
> >>> should fail right away:
> >>>
> >>> mpath-dc2-a (360060e8004f240000000f24000000502) dm-15 HITACHI,OPEN-V
> >>> -SU
> >>> [size=26G][features=0][hwhandler=0][rw]
> >>> \_ round-robin 0 [prio=4][active]
> >>>  \_ 5:0:1:0     sdu  65:64  [active][ready]
> >>>  \_ 5:0:1:16384 sdac 65:192 [active][ready]
> >>>  \_ 5:0:1:32768 sdas 66:192 [active][ready]
> >>>  \_ 5:0:1:49152 sdba 67:64  [active][ready]
> >>> \_ round-robin 0 [prio=4][enabled]
> >>>  \_ 3:0:1:0     sdaw 67:0   [active][ready]
> >>>  \_ 3:0:1:16384 sdbe 67:128 [active][ready]
> >>>  \_ 3:0:1:32768 sdbi 67:192 [active][ready]
> >>>  \_ 3:0:1:49152 sdbm 68:0   [active][ready]
> >>>
> >>> I think they fail, since I see these messages from LVM:
> >>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
> >>> vg_syb_roger-lv_syb_roger_admin
> >>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices
> >>> in
> >>> vg_syb_roger-lv_syb_roger_admin
> >>>
> >>> But for some reason LVM can't remove them. Is there any option I should
> >>> have in lvm.conf?
> >>>
> >>> BestRegards
> >>> Jose
> >>>> Post your multipath.conf file; you may be queuing forever?
> >>>>
> >>>>
> >>>>
> >>>> On Wed, 2010-04-14 at 15:03 +0000, jose nuno neto wrote:
> >>>>> Hi2all
> >>>>>
> >>>>> I'm on RHEL 5.4 with
> >>>>> lvm2-2.02.46-8.el5_4.1
> >>>>> 2.6.18-164.2.1.el5
> >>>>>
> >>>>> I have a multipathed SAN connection on which I'm building LVs.
> >>>>> It's a cluster system, and I want LVs to switch on failure.
> >>>>>
> >>>>> If I simulate a failure through the OS via
> >>>>> /sys/bus/scsi/devices/$DEVICE/delete,
> >>>>> I get an LV failure and the service switches to the other node.
> >>>>>
> >>>>> But if I do a "real" port-down on the SAN switch, multipath reports
> >>>>> the paths down, but LVM commands hang forever and nothing gets
> >>>>> switched.
> >>>>>
> >>>>> From the logs I see multipath failing paths, and LVM reporting
> >>>>> "Failed to remove faulty devices".
> >>>>>
> >>>>> Any ideas how I should "fix" it?
> >>>>>
> >>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Log device, 253:53, has
> >>>>> failed.
> >>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Device failure in
> >>>>> vg_ora_scapa-lv_ora_scapa_redo
> >>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Another thread is handling an
> >>>>> event.  Waiting...
> >>>>>
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
> >>>>> paths: 0
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
> >>>>> paths: 0
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
> >>>>> paths: 0
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
> >>>>> paths: 0
> >>>>>
> >>>>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
> >>>>> vg_syb_roger-lv_syb_roger_admin
> >>>>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty
> >>>>> devices
> >>>>> in
> >>>>> vg_syb_roger-lv_syb_roger_admin
> >>>>>
> >>>>> Much Thanks
> >>>>> Jose
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>
> >>
> >
> > --
> > Sent from my mobile device
> >
> > Regards,
> > Eugene Vilensky
> > evilensky at gmail.com
> >
> >
> 
> _______________________________________________
> linux-lvm mailing list
> linux-lvm at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/




