[Linux-cluster] Re: [Cluster-devel] Bug on dlm

Patrick Caulfield pcaulfie at redhat.com
Fri Sep 28 13:51:19 UTC 2007


Jordi Prats wrote:
> Hi,
> Could this bug be causing this?
> 
> 
> [root@inf17 ~]# clustat
> Member Status: Inquorate
> 
>  Member Name                        ID   Status
>  ------ ----                        ---- ------
>  inf17                                 1 Online, Local
>  inf18                                 2 Offline
>  inf19                                 3 Offline
> 
> 
> [root@inf18 ~]# clustat
> Member Status: Quorate
> 
>  Member Name                        ID   Status
>  ------ ----                        ---- ------
>  inf17                                 1 Online
>  inf18                                 2 Online, Local
>  inf19                                 3 Offline
> 
> 
> [root@inf17 ~]# group_tool
> type             level name       id       state
> fence            0     default    00010001 JOIN_START_WAIT
> [1]
> dlm              1     rgmanager  00020001 JOIN_ALL_STOPPED
> [1]
> 
> [root@inf18 ~]# group_tool
> type             level name       id       state
> fence            0     default    00000000 JOIN_STOP_WAIT
> [1 2]
> dlm              1     rgmanager  00010002 JOIN_START_WAIT
> [2]


No, that's misconfigured fencing.
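
The group_tool output quoted above shows the fence group sitting in
JOIN_START_WAIT on inf17 and JOIN_STOP_WAIT on inf18, i.e. the join has not
completed on either node. As a rough sketch, the Python below parses that
kind of output and lists the groups that have not settled. It assumes a
settled group reports the state "none"; parse_group_tool and unsettled_groups
are illustrative names, not part of any cluster tool.

```python
def parse_group_tool(text):
    """Parse group_tool-style output into a list of dicts.

    Leading '>' quote markers are tolerated so mail-quoted output can be
    pasted in directly. Member lines such as "[1 2]" are skipped.
    """
    groups = []
    for line in text.splitlines():
        parts = line.lstrip("> ").split()
        # A group row has exactly five columns and a numeric level;
        # this also rejects the header row and shell prompt lines.
        if len(parts) == 5 and parts[1].isdigit():
            gtype, level, name, gid, state = parts
            groups.append({
                "type": gtype,
                "level": int(level),
                "name": name,
                "id": gid,
                "state": state,
            })
    return groups


def unsettled_groups(groups, settled_state="none"):
    """Return groups whose state differs from the assumed settled state."""
    return [g for g in groups if g["state"] != settled_state]


SAMPLE_INF17 = """\
> [root@inf17 ~]# group_tool
> type             level name       id       state
> fence            0     default    00010001 JOIN_START_WAIT
> [1]
> dlm              1     rgmanager  00020001 JOIN_ALL_STOPPED
> [1]
"""

for group in unsettled_groups(parse_group_tool(SAMPLE_INF17)):
    print(group["type"], group["name"], group["state"])
```

A fence group that stays in a *_WAIT state is the first thing to look at,
since the dlm group for rgmanager sits one level above it.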

> [root@inf17 ~]# cman_tool status
> Version: 6.0.1
> Config Version: 4
> Cluster Name: boumort
> Cluster Id: 13356
> Cluster Member: Yes
> Cluster Generation: 3824
> Membership state: Cluster-Member
> Nodes: 1
> Expected votes: 2
> Total votes: 1
> Quorum: 2 Activity blocked
> Active subsystems: 7
> Flags:
> Ports Bound: 0
> Node name: inf17
> Node ID: 1
> Multicast addresses: 239.192.52.96
> Node addresses: 192.168.22.17
> 
> 
> [root@inf18 ~]# cman_tool status
> Version: 6.0.1
> Config Version: 4
> Cluster Name: boumort
> Cluster Id: 13356
> Cluster Member: Yes
> Cluster Generation: 3820
> Membership state: Cluster-Member
> Nodes: 2
> Expected votes: 2
> Total votes: 2
> Quorum: 2
> Active subsystems: 7
> Flags:
> Ports Bound: 0 177
> Node name: inf18
> Node ID: 2
> Multicast addresses: 239.192.52.96
> Node addresses: 192.168.22.18
> 
> 
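
The two cman_tool status outputs disagree on membership: inf17 counts 1 vote
against a quorum of 2 (hence "Activity blocked"), while inf18 counts 2 votes.
A minimal sketch of that comparison in Python, assuming the "Key: value"
layout shown above; parse_cman_status and is_quorate are hypothetical helper
names, not cman functions.

```python
def parse_cman_status(text):
    """Turn 'Key: value' lines into a dict, ignoring mail quote markers."""
    fields = {}
    for line in text.splitlines():
        line = line.lstrip("> ").strip()
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields


def is_quorate(fields):
    """Compare the node's total votes with the quorum it reports.

    The Quorum field may carry a trailing note such as
    "2 Activity blocked", so only the leading number is used.
    """
    total_votes = int(fields["Total votes"])
    quorum = int(fields["Quorum"].split()[0])
    return total_votes >= quorum


INF17 = """\
> Nodes: 1
> Expected votes: 2
> Total votes: 1
> Quorum: 2 Activity blocked
> Node name: inf17
"""

INF18 = """\
> Nodes: 2
> Expected votes: 2
> Total votes: 2
> Quorum: 2
> Node name: inf18
"""

for sample in (INF17, INF18):
    status = parse_cman_status(sample)
    print(status["Node name"], "quorate" if is_quorate(status) else "inquorate")
```

This matches what clustat reports: inf17 is inquorate and inf18 is quorate.
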
> Patrick Caulfield wrote:
>> Jordi Prats wrote:
>>> Hi,
>>> I've found this while starting my server. It's an F7 with the latest
>>> version available.
>>>
>>> Hope this helps :)
>>>
>>> Jordi
>>>
>>> Jul 26 23:52:51 inf18 kernel: dlm: rgmanager: recover 1
>>> Jul 26 23:52:51 inf18 kernel: dlm: rgmanager: add member 2
>>> Jul 26 23:52:51 inf18 kernel: dlm: rgmanager: total members 1 error 0
>>> Jul 26 23:52:51 inf18 kernel: dlm: rgmanager: dlm_recover_directory
>>> Jul 26 23:52:51 inf18 kernel: dlm: rgmanager: dlm_recover_directory 0 entries
>>> Jul 26 23:52:51 inf18 kernel:
>>> Jul 26 23:52:51 inf18 kernel: =====================================
>>> Jul 26 23:52:51 inf18 kernel: [ BUG: bad unlock balance detected! ]
>>> Jul 26 23:52:51 inf18 kernel: -------------------------------------
>>> Jul 26 23:52:51 inf18 kernel: dlm_recoverd/2963 is trying to release lock (&ls->ls_in_recovery) at:
>>> Jul 26 23:52:51 inf18 kernel: [<ee67b874>] dlm_recoverd+0x265/0x433 [dlm]
>>> Jul 26 23:52:51 inf18 kernel: but there are no more locks to release!
>>> Jul 26 23:52:51 inf18 kernel:
>>
>> Yeah, we know about it. It's not actually a bug, just the lockdep checking
>> code being a little over-enthusiastic. Unfortunately there aren't any
>> annotations available to make it quiet either.
>>
>> The trick is to live with it, or to use kernels that have a little less
>> debugging compiled in, which you would want to do for production anyway :)
>>
>>
>> Patrick
>>
>>
> 
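
To see whether a kernel was built with the lock debugging that prints
"bad unlock balance detected", one can scan its build config. The sketch
below assumes the message comes from the lockdep family of options
(CONFIG_LOCKDEP, CONFIG_PROVE_LOCKING, CONFIG_DEBUG_LOCK_ALLOC) and that the
config is available as text; on Fedora it is usually a file named
/boot/config-<version>, which is an assumption about the install. The
function name is illustrative.

```python
# Lock-debugging options assumed to be relevant to the lockdep report.
LOCKDEP_OPTIONS = (
    "CONFIG_LOCKDEP",
    "CONFIG_PROVE_LOCKING",
    "CONFIG_DEBUG_LOCK_ALLOC",
)


def enabled_lockdep_options(config_text):
    """Return the lockdep-related options set to 'y' in a kernel config."""
    enabled = []
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set".
        if line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name in LOCKDEP_OPTIONS and value == "y":
            enabled.append(name)
    return enabled


SAMPLE_CONFIG = """\
CONFIG_DEBUG_KERNEL=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCKDEP=y
# CONFIG_DEBUG_LOCK_ALLOC is not set
"""

print(enabled_lockdep_options(SAMPLE_CONFIG))
```

An empty list would suggest a kernel with less debugging compiled in, which
is the kind meant for production use.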


-- 
Patrick

Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street,
Windsor, Berkshire, SL4 1TE, UK.
Registered in England and Wales under Company Registration No. 3798903



