[Linux-cluster] problems with clvmd

Christine Caulfield ccaulfie at redhat.com
Mon Apr 18 14:26:34 UTC 2011


On 18/04/11 15:11, Terry wrote:
> On Mon, Apr 18, 2011 at 8:57 AM, Christine Caulfield
> <ccaulfie at redhat.com>  wrote:
>> On 18/04/11 14:38, Terry wrote:
>>>
>>> On Mon, Apr 18, 2011 at 3:48 AM, Christine Caulfield
>>> <ccaulfie at redhat.com> wrote:
>>>>
>>>> On 17/04/11 21:52, Terry wrote:
>>>>>
>>>>> As a result of a strange situation where our licensing for storage
>>>>> dropped off, I need to join a centos 5.6 node to a now single node
>>>>> cluster.  I got it joined to the cluster but I am having issues with
>>>>> CLVMD.  Any lvm operations on both boxes hang.  For example, vgscan.
>>>>> I have increased debugging and I don't see any logs.  The VGs aren't
>>>>> being populated in /dev/mapper.  This WAS working right after I joined
>>>>> it to the cluster and now it's not for some unknown reason.  Not sure
>>>>> where to take this at this point.   I did find one weird startup log
>>>>> that I am not sure what it means yet:
>>>>> [root at omadvnfs01a ~]# dmesg | grep dlm
>>>>> dlm: no local IP address has been set
>>>>> dlm: cannot start dlm lowcomms -107
>>>>> dlm: Using TCP for communications
>>>>> dlm: connecting to 2
>>>>>
>>>>
>>>>
>>>> That message usually means that dlm_controld has failed to start. Try
>>>> starting the cman daemons (groupd, dlm_controld) manually with the -D
>>>> switch and read the output, which might give some clues as to why it's
>>>> not working.
>>>>
>>>> Chrissie
>>>>
>>>
>>>
>>> Hi Chrissie,
>>>
>>> I thought of that but I see dlm started on both nodes.  See right below.
>>>
>>>>> [root at omadvnfs01a ~]# ps xauwwww | grep dlm
>>>>> root      5476  0.0  0.0  24736   760 ?        Ss   15:34   0:00
>>>>> /sbin/dlm_controld
>>>>> root      5502  0.0  0.0      0     0 ?        S<        15:34   0:00
>>
>>
>> Well, that's encouraging in a way! But it's evidently not started fully or
>> the DLM itself would be working. So I still recommend starting it with -D to
>> see how far it gets.
>>
>>
>> Chrissie
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>
> I think we had posts cross.  Here's my latest:
>
> Ok, started all the CMAN elements manually as you suggested.  I
> started them in order as in the init script. Here's the only error
> that I see.  I can post the other debug messages if you think they'd
> be useful but this is the only one that stuck out to me.
>
> [root at omadvnfs01a ~]# /sbin/dlm_controld -D
> 1303134840 /sys/kernel/config/dlm/cluster/comms: opendir failed: 2
> 1303134840 /sys/kernel/config/dlm/cluster/spaces: opendir failed: 2
> 1303134840 set_ccs_options 480
> 1303134840 cman: node 2 added
> 1303134840 set_configfs_node 2 10.198.1.111 local 0
> 1303134840 cman: node 3 added
> 1303134840 set_configfs_node 3 10.198.1.110 local 1
>

Can I see the whole set, please? It looks like dlm_controld might be
stalled registering with groupd.
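
One way to collect the complete debug output for posting might look like
the sketch below. The helper name capture_debug and the log path are
assumptions made for illustration; only the daemon path and the -D switch
come from the commands already shown in this thread.

```shell
#!/bin/sh
# capture_debug is a hypothetical helper, not part of cman.
# It runs a command in the foreground, merges stderr into stdout,
# and writes everything to a log file while still showing it on screen.
capture_debug() {
    log="$1"
    shift
    "$@" 2>&1 | tee "$log"
}

# Example use (the log location is an assumption; stop with Ctrl-C when done):
# capture_debug /tmp/dlm_controld.debug /sbin/dlm_controld -D
```

The resulting file can then be attached or pasted in full rather than
picking out individual lines.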

Chrissie



