[Linux-cluster] Hassle with clvmd over external network

Sebastian Walter sebastian.walter at fu-berlin.de
Fri Jul 20 10:59:12 UTC 2007


Hello Patrick, thanks for your comment.

I can't see any messages regarding dlm in the syslog (what I grepped for
is in the P.S. at the bottom). When I call `strace clvmd -d`, the first
two lines are as follows:
[root@compute-0-1 ~]# strace clvmd -d
execve("/usr/sbin/clvmd", ["clvmd", "-d"], [/* 34 vars */]) = 0
uname({sys="Linux", node="compute-0-1.local", ...}) = 0
[...]
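
For anyone reproducing this: the relevant calls can be filtered with
standard strace options (just a sketch; -e trace= only limits which
syscalls get printed, nothing here is cluster-specific):

  strace -f -e trace=uname,connect clvmd -d   # print only uname and connect calls, following children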

So clvmd obviously doesn't use the hostname from cluster.conf (where it
should be c0-1.external.dns.com), but the one provided by uname. Does
anybody know how to change this behaviour? Any help is greatly
appreciated!
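
In case it is useful to others, this is roughly how I compare the two
names (assuming the standard /etc/cluster/cluster.conf location; the
external name is of course specific to our site):

  uname -n                                            # the name clvmd picks up
  grep 'clusternode name' /etc/cluster/cluster.conf   # the names the cluster expects
  getent hosts $(uname -n)                            # address the uname name resolves to
  getent hosts c0-1.external.dns.com                  # address the external name resolves to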

Regards,
Sebastian


Patrick Caulfield wrote:
> Sebastian Walter wrote:
>   
>> Dear list,
>>
>> I'm trying to set up RHCS and GFS in a cluster that is attached to two
>> networks: an internal one (eth0, 10.1.0.0/16, DNS names like host.local)
>> and an external one (eth1, our real-world subnet and DNS names). Every
>> cluster node has both interfaces and an IP address in each network.
>>
>> When I set up RHCS and GFS on the local subnet, everything works
>> fine (ccsd, cman, clvmd, ... and also mountable GFS volumes). But if I
>> change cluster.conf to use the real-world addresses (I want to use
>> the GFS volumes outside of the cluster as well), clvmd always causes
>> problems. I followed the FAQ and changed /etc/init.d/cman to join
>> with -n host.external.dns.com. All hosts are in /etc/hosts. ccsd starts
>> fine on all nodes, as do cman and fenced. But when I try to start
>> the clvmd service on all nodes simultaneously, I get errors (Starting
>> clvmd: clvmd startup timed out).
>>
>> This is what my /proc gives me:
>> [root@dtm ~]# cat /proc/cluster/services
>> Service          Name                              GID LID State     Code
>> Fence Domain:    "default"                          11   2 run       -
>> [8 7 6 5 4 3 2 9 10 11 1]
>>
>> DLM Lock Space:  "clvmd"                            14   3 join      S-6,20,11
>> [8 7 6 5 4 3 2 9 10 11 1]
>>
>> [root@compute-0-1 ~]# cat /proc/cluster/services
>> Service          Name                              GID LID State     Code
>> Fence Domain:    "default"                          11   2 run       -
>> [2 3 4 5 6 7 9 10 8 11 1]
>>
>> DLM Lock Space:  "clvmd"                            14   3 update    U-4,1,1
>> [2 3 4 5 6 7 8 9 10 11 1]
>>
>> (I get the second output from all the other nodes; I think it depends
>> on which host I start the service on first.)
>>
>> Does anybody have an idea how the clvmd instances communicate with
>> each other? cman is doing fine... Any other experiences? Thanks for
>> any advice...
>>
>>     
>
>
> It'll be waiting for the DLM lockspace creation to complete on all nodes. Have a
> look in syslog for DLM messages.
>
>   
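
P.S., for the archives: "have a look in syslog for DLM messages" boils
down to something like the following on our nodes (paths assuming a
stock syslog setup that writes kernel and daemon messages to
/var/log/messages):

  grep -i dlm /var/log/messages   # DLM messages that made it to syslog
  dmesg | grep -i dlm             # anything still only in the kernel ring buffer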



