[dm-devel] Unable to deactivate lv, perhaps due to semaphore problem...

Gianluca Cecchi gianluca.cecchi at gmail.com
Thu Nov 27 15:01:53 UTC 2014


On Thu, Nov 27, 2014 at 3:33 PM, Zdenek Kabelac <zkabelac at redhat.com> wrote:

> On 27.11.2014 at 15:26, Gianluca Cecchi wrote:
>
>> Hello,
>> I'm unable to deactivate a logical volume.
>>
>> My system is RHEL 6.5 with lvm2-2.02.100-8.el6.x86_64 and kernel
>> 2.6.32-431.29.2.el6.x86_64
>>
>> I get error code 5 with message
>>    Logical volume VG_AAA_TEMP/LV_AAA_TEMP in use.
>>
>> You can find output of
>> lvchange -d -d -d -d -d -d -an VG_AAA_TEMP/LV_AAA_TEMP
>> here:
>> https://drive.google.com/file/d/0BwoPbcrMv8mvTjlBMkRUbG9nczA/view?usp=sharing
>>
>>
> Not really accessible.
>

Strange, do you mean the Google Drive link?
I tried with a browser that is not logged in to any Gmail account and I'm
able to download it.


>
> But anyway, if you have a problem with 'semaphore' resources, you could
> 'recycle' old ones:
>
> 'dmsetup  udevcomplete_all'
>

This is actually a production server with many other LVs... Is there any
drawback in the command above?
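
Before recycling anything on a production box, it may help to see how many
device-mapper udev cookies are actually outstanding. A minimal sketch,
assuming the cookies show up in `ipcs -s` as System V semaphores whose key
starts with 0x0d4d (the device-mapper cookie prefix); the helper name
count_dm_cookies is made up for illustration:

```shell
# Count System V semaphores whose key carries the device-mapper cookie
# prefix 0x0d4d (assumption: this is how udev cookies appear in ipcs).
# Reads `ipcs -s` output on stdin and prints a single number.
count_dm_cookies() {
    awk 'tolower($1) ~ /^0x0d4d/ { n++ } END { print n + 0 }'
}

# Possible usage on the affected node (not run here):
#   ipcs -s | count_dm_cookies
#   dmsetup udevcookies        # lists the same cookies with their state
```

A non-zero count that never drops would be consistent with stuck udev
processing; a count of zero would suggest the semaphores are not the cause.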


>
> Of course it's hard to guess what experiments you are doing and what
> could lead to uncompleted cookies (stuck udev scans).
>

Actually no experiment at all.
The node is part of a two-node RHEL production cluster with HA_LVM based
services.
We need to relocate many services to the other node for planned
maintenance, but it seems that this node can stop the lvm resources but
cannot cleanly deactivate the LVs. We get messages like

Nov 26 17:35:29 orapr2 rgmanager[5765]: [lvm] Deactivating VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:29 orapr2 rgmanager[5786]: [lvm] Making resilient : lvchange -an VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:29 orapr2 rgmanager[5809]: [lvm] Resilient command: lvchange -an VG_AAA_TEMP/LV_AAA_TEMP --config devices{filter=["a|/dev/mapper/360a9800037543544465d424
Nov 26 17:35:34 orapr2 rgmanager[5883]: [lvm] lv_exec_resilient failed
Nov 26 17:35:34 orapr2 rgmanager[5908]: [lvm] lv_activate_resilient stop failed on VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:34 orapr2 rgmanager[5928]: [lvm] Unable to deactivate VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:34 orapr2 rgmanager[5948]: [lvm] Failed to stop VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:34 orapr2 rgmanager[5968]: [lvm] Attempting cleanup of VG_AAA_TEMP
Nov 26 17:35:34 orapr2 rgmanager[5989]: [lvm] VG_AAA_TEMP now consistent
Nov 26 17:35:34 orapr2 rgmanager[6013]: [lvm] Deactivating VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:34 orapr2 rgmanager[6033]: [lvm] Making resilient : lvchange -an VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:35 orapr2 rgmanager[6056]: [lvm] Resilient command: lvchange -an VG_AAA_TEMP/LV_AAA_TEMP --config devices{filter=["a|/dev/mapper/360a9800037543544465d424
Nov 26 17:35:39 orapr2 rgmanager[6648]: [lvm] lv_exec_resilient failed
Nov 26 17:35:40 orapr2 rgmanager[6670]: [lvm] lv_activate_resilient stop failed on VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:40 orapr2 rgmanager[6690]: [lvm] Unable to deactivate VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:40 orapr2 rgmanager[6710]: [lvm] Failed second attempt to stop VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:40 orapr2 rgmanager[20260]: stop on lvm "LV_AAA_TEMP" returned 1 (generic error)
Nov 26 17:35:40 orapr2 rgmanager[20260]: Marking service:AAA as 'disabled', but some resources may still be allocated!
Nov 26 17:35:40 orapr2 rgmanager[20260]: Service service:AAA is disabled

And of course the other node is then unable to activate the service,
because the LV is still held open by the first one:

Nov 26 17:35:40 orapr1 rgmanager[18596]: Starting disabled service service:AAA
Nov 26 17:35:41 orapr1 rgmanager[31420]: [lvm] Someone else owns this logical volume
Nov 26 17:35:41 orapr1 rgmanager[18596]: start on lvm "LV_AAA_TEMP" returned 1 (generic error)
Nov 26 17:35:41 orapr1 rgmanager[18596]: #68: Failed to start service:AAA; return value: 1

So I'm trying to reproduce the cluster command by hand to see how to clean
up the situation, using this particular service (named AAA), which is not
as critical as the other ones running on the node.


> Do you happen to have some suspended devices in your table?
> (dmsetup info -c    should show them)


It seems not so. Only (L)ive states...

[root@orapr2 ~]# dmsetup info -c | awk '{print $4}' | sort | uniq -c
     77 L--w
      1 Stat



>
>
>> [root@orapr2 ~]# lvs VG_AAA_TEMP/LV_AAA_TEMP
>>    LV          VG          Attr       LSize    Pool Origin Data%  Move Log Cpy%Sync Convert
>>    LV_AAA_TEMP VG_AAA_TEMP -wi-ao---- 1020.00m
>>
>> How can I see what holds the reference that apparently keeps it open
>> (Open count: 1), so I can check and possibly fix it?
>>
>>
> dmsetup ls --tree
>
> is usually good at showing deps between devs (i.e. target A holds target B)
>
> Regards
>
> Zdenek
>
>
>
It shows nothing unusual for this LV:
...
 VG_AAA_TEMP-LV_AAA_TEMP (253:49)
 └─360a9800037543544465d424130533177 (253:4)
    ├─ (130:128)
    ├─ (129:32)
    ├─ (68:48)
    ├─ (8:96)
    ├─ (8:288)
    ├─ (133:224)
    ├─ (69:192)
    └─ (66:160)
...
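
The tree shows what the LV sits on, not what sits on top of it. One way to
check for kernel-level users of 253:49 is the holders directory in sysfs. A
minimal sketch, assuming /sys/dev/block/MAJOR:MINOR is available on this
kernel; the function name list_holders and the SYSFS_ROOT override are
hypothetical and exist only so the function can be tried without real
devices:

```shell
# List kernel-level holders (other block devices stacked on top) of a
# block device given as MAJOR:MINOR, for example 253:49.
# SYSFS_ROOT is a made-up override for trying this outside a real system.
list_holders() {
    devno=$1
    dir="${SYSFS_ROOT:-/sys}/dev/block/$devno/holders"
    if [ ! -d "$dir" ]; then
        echo "no such device in sysfs: $devno" >&2
        return 1
    fi
    ls -1 "$dir"
}

# Possible usage on the affected node (not run here):
#   list_holders 253:49
```

If this prints nothing while the open count is still 1, the reference is
more likely held by a mount or a userspace process, which tools such as
lsof or fuser on the /dev/mapper node may reveal.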

BTW: I'm testing with this one, but the problem seems to be general: every
LV behaves this way when I try to deactivate it...

Thanks in advance for any other insight, and let me know if I should send
the debug log of the lvchange command in case you are still unable to
access it...

Gianluca