[linux-lvm] LVM hangs on SAN fail

Bryan Whitehead driver at megahappy.net
Thu Apr 15 09:32:13 UTC 2010


Can you post the output of pvdisplay?

Also the output of multipath when the port is down?

If your multipath output is still showing all paths [active][ready]
when you shut a port down, you might need to change the path_checker
option. I don't have a Hitachi array, but readsector0 (the default) did
not work for me; directio does. This could be LVM seeing I/O time out
while multipath isn't failing the dead path.
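
For reference, a minimal sketch of that change in the Hitachi stanza of your
multipath.conf (everything except path_checker is copied from the config you
posted below; check Hitachi's recommended settings for OPEN-V before trusting
it):

devices {
       device {
                vendor                          "HITACHI"
                product                         "OPEN-V"
                path_checker                    directio
                path_grouping_policy            group_by_node_name
                failback                        immediate
                no_path_retry                   fail
       }
}

Then make multipathd reload the config (e.g. run "multipathd -k" and type
"reconfigure" at its prompt) and re-check "multipath -ll" with the port down;
the dead paths should flip to [failed][faulty] instead of staying
[active][ready].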


On Thu, Apr 15, 2010 at 1:29 AM, jose nuno neto <jose.neto at liber4e.com> wrote:
> Good morning,
>
> This is what I have on multipath.conf
>
> blacklist {
>        wwid SSun_VOL0_266DCF4A
>        wwid SSun_VOL0_5875CF4A
>        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
>        devnode "^hd[a-z]"
> }
> defaults {
>                user_friendly_names             yes
> }
> devices {
>       device {
>                vendor                          "HITACHI"
>                product                         "OPEN-V"
>                path_grouping_policy            group_by_node_name
>                failback                        immediate
>                no_path_retry                   fail
>       }
>       device {
>                vendor                          "IET"
>                product                         "VIRTUAL-DISK"
>                path_checker                    tur
>                path_grouping_policy            failover
>                failback                        immediate
>                no_path_retry                   fail
>       }
> }
>
> As an example, this is one LUN. It shows [features=0], so I'd say it should
> fail right away:
>
> mpath-dc2-a (360060e8004f240000000f24000000502) dm-15 HITACHI,OPEN-V      -SU
> [size=26G][features=0][hwhandler=0][rw]
> \_ round-robin 0 [prio=4][active]
>  \_ 5:0:1:0     sdu  65:64  [active][ready]
>  \_ 5:0:1:16384 sdac 65:192 [active][ready]
>  \_ 5:0:1:32768 sdas 66:192 [active][ready]
>  \_ 5:0:1:49152 sdba 67:64  [active][ready]
> \_ round-robin 0 [prio=4][enabled]
>  \_ 3:0:1:0     sdaw 67:0   [active][ready]
>  \_ 3:0:1:16384 sdbe 67:128 [active][ready]
>  \_ 3:0:1:32768 sdbi 67:192 [active][ready]
>  \_ 3:0:1:49152 sdbm 68:0   [active][ready]
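>
> (A quick way to confirm that queueing really is off on the live map, assuming
> the alias above, is to check its device-mapper table:
>
>   dmsetup table mpath-dc2-a | grep queue_if_no_path
>
> No output means the map fails I/O immediately when all paths are gone rather
> than queueing it.)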
>
> I think they fail, since I see these messages from LVM:
> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
> vg_syb_roger-lv_syb_roger_admin
> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices in
> vg_syb_roger-lv_syb_roger_admin
>
> But for some reason LVM can't remove them. Is there any option I should have
> in lvm.conf?
>
> Best regards,
> Jose
>> Post your multipath.conf file; you may be queuing forever?
>>
>>
>>
>> On Wed, 2010-04-14 at 15:03 +0000, jose nuno neto wrote:
>>> Hi to all,
>>>
>>> I'm on RHEL 5.4 with
>>> lvm2-2.02.46-8.el5_4.1
>>> 2.6.18-164.2.1.el5
>>>
>>> I have a multipathed SAN connection on which I'm building LVs.
>>> It's a cluster system, and I want the LVs to switch over on failure.
>>>
>>> If I simulate a failure through the OS via
>>> /sys/bus/scsi/devices/$DEVICE/delete
>>> I get an LV failure and the service switches to the other node.
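>>>
>>> For example, to drop one of the paths (the device address here is just an
>>> illustration; pick a real one from "multipath -ll"):
>>>
>>>   echo 1 > /sys/bus/scsi/devices/5:0:1:0/delete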
>>>
>>> But if I do a "real" port-down on the SAN switch, multipath reports the
>>> paths down, but the LVM commands hang forever and nothing gets switched.
>>>
>>> From the logs I see multipath failing paths, and LVM reporting "Failed to
>>> remove faulty devices".
>>>
>>> Any ideas how I should "fix" it?
>>>
>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Log device, 253:53, has failed.
>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Device failure in
>>> vg_ora_scapa-lv_ora_scapa_redo
>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Another thread is handling an
>>> event.  Waiting...
>>>
>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
>>> paths: 0
>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
>>> paths: 0
>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
>>> paths: 0
>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
>>> paths: 0
>>>
>>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
>>> vg_syb_roger-lv_syb_roger_admin
>>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices
>>> in
>>> vg_syb_roger-lv_syb_roger_admin
>>>
>>> Many thanks,
>>> Jose
>>>