[Linux-cluster] DLM in recover state - node can't connect to cluster
Maciej Bogucki
maciej.bogucki at artegence.com
Thu Aug 16 14:16:53 UTC 2007
Hello,
I have a five-node cluster. Node05 failed (kernel panic), and fencing
failed. When I rebooted the failed node05, it could not connect to the
cluster, and the filesystem is locked because it is in the recover state.
I need to reboot all nodes to recover the cluster.
On node05 I get "fenced: startup failed".
Here is the output from another node in the cluster:
---cut---
[root@node03 ~]# cat /proc/cluster/services
Service          Name            GID LID State     Code
Fence Domain:    "default"         1   2 run       U-1,10,1
[2 3 5 4]
DLM Lock Space:  "clvmd"           2   3 run       U-1,10,1
[2 3 5 4]
DLM Lock Space:  "repository"      3   4 recover 2 -
[2 3 5 4]
GFS Mount Group: "repository"      4   5 recover 0 -
[2 3 5 4]
[root@node03 ~]#
---cut---
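To make it easier to see which services are stuck, output like this can be filtered with a small script. The following is only a sketch: it assumes the layout shown in this post (a quoted service name followed by GID, LID and State, with the member list in brackets on the next line), and the names `parse_services` and `stuck_services` are made up for illustration. It does not interpret the "Code" column.

```python
import re

# Sample copied from the /proc/cluster/services output in this post.
# The "Code" values appear on their own lines here, as in the pasted text.
SAMPLE = '''\
Service Name GID LID State Code
Fence Domain: "default" 1 2 run
U-1,10,1
[2 3 5 4]
DLM Lock Space: "clvmd" 2 3 run
U-1,10,1
[2 3 5 4]
DLM Lock Space: "repository" 3 4 recover 2 -
[2 3 5 4]
GFS Mount Group: "repository" 4 5 recover 0 -
[2 3 5 4]
'''

# Assumed row layout: <kind>: "<name>" <gid> <lid> <state> ...
SERVICE_RE = re.compile(
    r'^(?P<kind>[^:"]+):\s+"(?P<name>[^"]+)"\s+'
    r'(?P<gid>\d+)\s+(?P<lid>\d+)\s+(?P<state>\S+)'
)


def parse_services(text):
    """Return one dict per service row, with the member list attached."""
    services = []
    for raw in text.splitlines():
        line = raw.strip()
        match = SERVICE_RE.match(line)
        if match:
            services.append({
                "kind": match.group("kind").strip(),
                "name": match.group("name"),
                "gid": int(match.group("gid")),
                "lid": int(match.group("lid")),
                "state": match.group("state"),
                "members": [],
            })
        elif line.startswith("[") and line.endswith("]") and services:
            # A bracketed line lists the node IDs of the preceding service.
            services[-1]["members"] = [int(n) for n in line[1:-1].split()]
        # Anything else (header, wrapped Code values) is ignored.
    return services


def stuck_services(text):
    """Return the services whose state is anything other than 'run'."""
    return [s for s in parse_services(text) if s["state"] != "run"]


if __name__ == "__main__":
    for svc in stuck_services(SAMPLE):
        print(f'{svc["kind"]} "{svc["name"]}": '
              f'state={svc["state"]} members={svc["members"]}')
```

On a live node the text would come from reading /proc/cluster/services instead of the embedded sample. For the output above, the script reports the "repository" DLM lock space and the "repository" GFS mount group.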
What does "U-1,10,1" mean?
Here is some information from cluster.conf:
---cut---
<fence_daemon clean_start="1" post_fail_delay="0" post_join_delay="3"/>
<cman expected_votes="3" deadnode_timeout="120" hello_timer="10"/>
---cut---
I am not running the latest cman, fence, dlm, and kernel packages, so
maybe that is the problem?
cman-1.0.11-0
fence-1.32.25-1
dlm-1.0.1-1
kernel-smp-2.6.9-42.0.3.EL
Best Regards
Maciej Bogucki