[Linux-cluster] cLVM unusable on quorated cluster

Daniel Dehennin daniel.dehennin at baby-gnu.org
Fri Oct 3 14:35:36 UTC 2014


Hello,

I'm trying to set up pacemaker+corosync on Debian Wheezy to access a SAN
for an OpenNebula cluster.

As I'm new to the cluster world, I have a hard time figuring out why things
sometimes go really wrong and where I must look to find answers.

My OpenNebula frontend, running in a VM, does not manage to run the
resources and my syslog has a lot of:

#+begin_src
ocfs2_controld: Unable to open checkpoint "ocfs2:controld": Object does not exist
#+end_src

When this happens, the other nodes have problems:

#+begin_src
root@nebula3:~# LANG=C vgscan
  cluster request failed: Host is down
  Unable to obtain global lock.
#+end_src

But things look fine in “crm_mon”:

#+begin_src
root@nebula3:~# crm_mon -1
============
Last updated: Fri Oct  3 16:25:43 2014
Last change: Fri Oct  3 14:51:59 2014 via cibadmin on nebula1
Stack: openais
Current DC: nebula3 - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
5 Nodes configured, 5 expected votes
32 Resources configured.
============

Node quorum: standby
Online: [ nebula3 nebula2 nebula1 ]
OFFLINE: [ one ]

 Stonith-nebula3-IPMILAN    (stonith:external/ipmi):    Started nebula2
 Stonith-nebula2-IPMILAN    (stonith:external/ipmi):    Started nebula3
 Stonith-nebula1-IPMILAN    (stonith:external/ipmi):    Started nebula2
 Clone Set: ONE-Storage-Clone [ONE-Storage]
     Started: [ nebula1 nebula3 nebula2 ]
     Stopped: [ ONE-Storage:3 ONE-Storage:4 ]
 Quorum-Node    (ocf::heartbeat:VirtualDomain): Started nebula3
 Stonith-Quorum-Node   (stonith:external/libvirt):   Started nebula3
#+end_src
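
For reading the "partition with quorum" line above, here is a minimal sketch of the usual majority rule, assuming one vote per node and that the standby node still counts as a present member (both are assumptions, not something the output states). The function names are hypothetical.

```python
def quorum_threshold(expected_votes):
    """Smallest number of votes that is a strict majority."""
    return expected_votes // 2 + 1


def has_quorum(votes_present, expected_votes):
    """True when the partition holds a strict majority of expected votes."""
    return votes_present >= quorum_threshold(expected_votes)


if __name__ == "__main__":
    # From the crm_mon output: 5 expected votes, 3 nodes online,
    # 1 node in standby (assumed to still vote), 1 node offline.
    expected = 5
    present = 3 + 1
    print(quorum_threshold(expected), has_quorum(present, expected))
```

Under these assumptions the partition stays quorate even though one node is offline, which would be consistent with crm_mon looking healthy while the DLM lockspaces tell a different story.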

I don't know how to interpret the dlm_tool output:

#+begin_src
root@nebula3:~# dlm_tool ls -n
dlm lockspaces
name          CCB10CE8D4FF489B9A2ECB288DACF2D7
id            0x09250e49
flags         0x00000008 fs_reg
change        member 3 joined 1 remove 0 failed 0 seq 2,2
members       1189587136 1206364352 1223141568 
all nodes
nodeid 1189587136 member 1 failed 0 start 1 seq_add 1 seq_rem 0 check none
nodeid 1206364352 member 1 failed 0 start 1 seq_add 2 seq_rem 0 check none
nodeid 1223141568 member 1 failed 0 start 1 seq_add 1 seq_rem 0 check none

name          clvmd
id            0x4104eefa
flags         0x00000000 
change        member 3 joined 0 remove 1 failed 0 seq 4,4
members       1189587136 1206364352 1223141568 
all nodes
nodeid 1172809920 member 0 failed 0 start 0 seq_add 3 seq_rem 4 check none
nodeid 1189587136 member 1 failed 0 start 1 seq_add 1 seq_rem 0 check none
nodeid 1206364352 member 1 failed 0 start 1 seq_add 2 seq_rem 0 check none
nodeid 1223141568 member 1 failed 0 start 1 seq_add 1 seq_rem 0 check none
#+end_src
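
As a reading aid for the output above, the following is a minimal sketch that parses the "nodeid" lines of each lockspace and lists the nodes recorded with "member 0". It assumes the node IDs were auto-generated by corosync from the ring0 IPv4 address and read on a little-endian host, so they can be decoded back to dotted addresses; that is an assumption, and the decoding is meaningless if nodeids were set explicitly. The function names are hypothetical, and the sample is the clvmd lockspace copied from the output above.

```python
import socket
import struct

# The clvmd lockspace as shown in the dlm_tool output above.
SAMPLE = """\
name          clvmd
id            0x4104eefa
flags         0x00000000
change        member 3 joined 0 remove 1 failed 0 seq 4,4
members       1189587136 1206364352 1223141568
all nodes
nodeid 1172809920 member 0 failed 0 start 0 seq_add 3 seq_rem 4 check none
nodeid 1189587136 member 1 failed 0 start 1 seq_add 1 seq_rem 0 check none
nodeid 1206364352 member 1 failed 0 start 1 seq_add 2 seq_rem 0 check none
nodeid 1223141568 member 1 failed 0 start 1 seq_add 1 seq_rem 0 check none
"""


def nodeid_to_ip(nodeid):
    # Assumption: the nodeid is an IPv4 address stored as a 32-bit
    # integer in little-endian byte order.
    return socket.inet_ntoa(struct.pack("<I", nodeid))


def parse_lockspaces(text):
    """Return {lockspace name: {nodeid: {field: value}}}."""
    lockspaces = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "name" and len(fields) == 2:
            current = lockspaces.setdefault(fields[1], {})
        elif fields[0] == "nodeid" and current is not None:
            # The rest of the line is a flat list of key/value pairs.
            current[int(fields[1])] = dict(zip(fields[2::2], fields[3::2]))
    return lockspaces


def departed_nodes(lockspaces):
    """Per lockspace, the decoded addresses of nodes with member 0."""
    return {
        name: sorted(nodeid_to_ip(n) for n, a in nodes.items()
                     if a.get("member") == "0")
        for name, nodes in lockspaces.items()
    }


if __name__ == "__main__":
    print(departed_nodes(parse_lockspaces(SAMPLE)))
```

Under the stated assumption, the node that joined the clvmd lockspace at sequence 3 and was removed at sequence 4 decodes to 192.168.231.69, one address below the three remaining members.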

Attachment: dlm_tool-dump.txt
<http://listman.redhat.com/archives/linux-cluster/attachments/20141003/96364a60/attachment.txt>

Is there any documentation on troubleshooting DLM/cLVM?

Regards.

-- 
Daniel Dehennin
Get my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
Fingerprint: 3E69 014E 5C23 50E8 9ED6  2AAD CC1E 9E5B 7A6F E2DF