[Linux-cluster] Unexpected problems with clvmd

Shaun Mccullagh Shaun.Mccullagh at espritxb.nl
Wed Dec 3 13:57:00 UTC 2008


Hi Chrissie


Fence status is 'none' on all nodes.

Shaun


for i in 10.0.154.10 pan5.tmf pan6.tmf pan4.tmf ; do ssh $i /sbin/group_tool | grep fence; done
fence            0     default     00000000 none
fence            0     default     00000000 none
fence            0     default     00000000 none
fence            0     default     00000000 none
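Since it is the dlm/clvmd group that is stuck, the same loop can be used to collect its state from each node (a sketch only; it assumes the same node list and /sbin/group_tool path as above):

for i in 10.0.154.10 pan5.tmf pan6.tmf pan4.tmf ; do ssh $i /sbin/group_tool | grep clvmd; done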

-----Original Message-----
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Christine Caulfield
Sent: Wednesday 3 December 2008 14:53
To: linux clustering
Subject: Re: [Linux-cluster] Unexpected problems with clvmd

Shaun Mccullagh wrote:
> Hi,
> 
> I tried to add another node to our 3 node cluster this morning.
> 
> Initially things went well, but I wanted to check that the new node
> booted correctly.
> 
> After the second reboot clvmd failed to start up on the new node 
> (called
> pan4):
> 
> [root at pan4 ~]# clvmd -d1 -T20
> CLVMD[8e1e8300]: Dec  3 14:24:09 CLVMD started
> CLVMD[8e1e8300]: Dec  3 14:24:09 Connected to CMAN
> CLVMD[8e1e8300]: Dec  3 14:24:12 CMAN initialisation complete
> 
> group_tool reports this output for clvmd on all four nodes in the
> cluster:
> 
> dlm              1     clvmd       00010005 FAIL_START_WAIT
> dlm              1     clvmd       00010005 FAIL_ALL_STOPPED
> dlm              1     clvmd       00010005 FAIL_ALL_STOPPED
> dlm              1     clvmd       00000000 JOIN_STOP_WAIT
> 
> Otherwise the cluster is OK:
> 
> [root at brik3 ~]# clustat
> Cluster Status for mtv_gfs @ Wed Dec  3 14:38:26 2008
> Member Status: Quorate
> 
>  Member Name                                               ID   Status
>  ------ ----                                               ---- ------
>  pan4                                                          4 Online
>  pan5                                                          5 Online
>  nfs-pan                                                       6 Online
>  brik3-gfs                                                     7 Online, Local
> 
> [root at brik3 ~]# cman_tool status
> Version: 6.1.0
> Config Version: 4
> Cluster Name: mtv_gfs
> Cluster Id: 14067
> Cluster Member: Yes
> Cluster Generation: 172
> Membership state: Cluster-Member
> Nodes: 4
> Expected votes: 4
> Total votes: 4
> Quorum: 3
> Active subsystems: 8
> Flags: Dirty
> Ports Bound: 0 11
> Node name: brik3-gfs
> Node ID: 7
> Multicast addresses: 239.192.54.42
> Node addresses: 172.16.1.60
> 
> It seems I have created a deadlock; what is the best way to fix this?
> 
> TIA
> 
>
The first thing to check is the fencing status, via group_tool and
syslog. If fencing hasn't completed, the DLM can't recover.
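
For example, something like this on each node should show whether a
fence operation is still outstanding (a sketch only; it assumes the
stock cman tools and that syslog goes to /var/log/messages):

/sbin/group_tool ls                        # state of the fence "default" group
/sbin/cman_tool nodes                      # current membership as cman sees it
grep -i fenced /var/log/messages | tail    # recent fenced activity in syslog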
-- 

Chrissie

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster








