[linux-lvm] [lvmlockd] "VGLK res_unlock lm error -250" and lvm command hung forever

Damon Wang damon.devops at gmail.com
Wed May 30 13:17:31 UTC 2018


After days of testing, I'm pretty sure the problem has been solved by
upgrading sanlock to 3.6.0. Thanks, Dave!

Damon

2018-05-25 0:50 GMT+08:00 Damon Wang <damon.devops at gmail.com>:
> Thank you for your reply!
>
> I'll try sanlock 3.6.0 first (currently I'm using 3.5.0) and see
> whether it happens again.
>
> Damon
>
> 2018-05-24 23:46 GMT+08:00 David Teigland <teigland at redhat.com>:
>> On Thu, May 24, 2018 at 10:44:05PM +0800, Damon Wang wrote:
>>> Hi all,
>>>
>>> I'm using lvmlockd + sanlock on iSCSI, and sometimes (usually during
>>> intensive operations) it reports that the VG lock has failed:
>>
>> Hi, thanks for this report.
>>
>>> /var/log/messages:
>>>
>>>     May 24 21:14:29 dev1 sanlock[1108]: 2018-05-24 21:14:29 605471
>>> [1112]: r627 paxos_release 8255 other lver 8258
>>
>> I believe this is the sanlock bug that was fixed here:
>> https://pagure.io/sanlock/c/735781d683e99cccb3be7ffe8b4fff1392a2a4c8?branch=master
>>
>> By itself, the bug isn't a big problem: the lock was released, but sanlock
>> returned an error.  The bigger problem is that lvmlockd then believes that
>> the lock was not released:
>>
>>>     1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
>>> unlock_san release error -1
>>
>> so subsequent requests for the lock get backed up in lvmlockd:
>>
>>>     [root at dev1 ~]# lvmlockctl -i
>>>     LW VG sh ver 0 pid 34216 (lvchange)
>>>     LW VG sh ver 0 pid 75685 (lvs)
>>>     LW VG sh ver 0 pid 83741 (lvdisplay)
>>>     LW VG sh ver 0 pid 90569 (lvchange)
>>>     LW VG sh ver 0 pid 92735 (lvchange)
>>>     LW VG sh ver 0 pid 99982 (lvs)
>>>     LW VG sh ver 0 pid 14069 (lvchange)
>>
>>> My questions are:
>>>
>>> 1. Why did the VGLK fail? Is it because of a network failure (causing
>>> iSCSI to fail so that sanlock could not reach the VGLK volume)? Can I
>>> find direct proof?
>>
>> I believe it's the bug above.  Failures of the storage network can also
>> cause similar issues, but you would see error messages related to i/o
>> timeouts.
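>>
>> Nothing fancy is needed to look for those: simple keyword matching
>> against the same log file quoted above is enough (the exact message
>> text varies, so treat this as a rough filter):
>>
>>   grep sanlock /var/log/messages | grep -iE 'timeout|renew|error'
>>
>> If that turns up i/o or lease renewal errors around the same time, the
>> storage path is the more likely suspect.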
>>
>>> 2. Is it recoverable? I have tried killing all the hung commands, but
>>> new commands still hang forever.
>>
>> There are recently added options for this kind of situation, but I don't
>> believe there is an lvm release with those yet.
>>
>> If you are prepared to build your own version of lvm, build lvm release
>> 2.02.178 (which should be ready shortly; if it's not, take the git master
>> branch).  Be sure to configure with --enable-lvmlockd-sanlock.  Then try:
>>
>>   lvchange -an --lockopt skipvg <vgname>
>>   lvmlockctl --drop <vgname>
>>   stop lvmlockd, stop sanlock
>>   restart everything as usual
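>>
>> As a rough consolidated sketch, assuming systemd manages the daemons
>> (the unit names lvmlockd.service and sanlock.service below can differ
>> by distribution) and that "restart everything as usual" means rejoining
>> the lockspace, that sequence would look something like:
>>
>>   lvchange -an --lockopt skipvg <vgname>   # deactivate LVs, skip the VG lock
>>   lvmlockctl --drop <vgname>               # drop lvmlockd's state for the VG
>>   systemctl stop lvmlockd sanlock          # stop the lock daemons
>>   systemctl start sanlock lvmlockd         # start them again
>>   vgchange --lock-start <vgname>           # rejoin the VG lockspace
>>   lvchange -ay <vgname>/<lvname>           # reactivate LVs as needed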
>>
>> If that doesn't work, or if you don't want to build lvm, then unmount the
>> file systems, kill lvmlockd, and kill sanlock.  You might need to do some
>> dm cleanup if LVs were active (or perhaps just reboot the machine).  Then
>> restart everything as usual.
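>>
>> As a rough outline (the mount point, unit names, and device names below
>> are placeholders, and killall could be replaced by your init system's
>> stop commands):
>>
>>   umount /mnt/<fs>                         # unmount file systems on the VG
>>   killall lvmlockd sanlock                 # stop the daemons
>>   dmsetup ls                               # list leftover dm devices
>>   dmsetup remove <vgname>-<lvname>         # remove mappings for stale LVs
>>   systemctl start sanlock lvmlockd         # restart the daemons
>>   vgchange --lock-start <vgname>           # rejoin the lockspace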
>>
>> Dave



