[Linux-cluster] ccsd[846]: Error while processing get: No data available

Dascalu Dragos dascalu_dragos at bah.com
Fri Oct 22 18:37:33 UTC 2004


Hi,

We have been running a GFS cluster for a few months with no issues, 
but we are now having a problem with one of the nodes (web4). In 
summary, after running our startup scripts (broken into individual 
commands below), web4 cannot mount the shared partitions (from a SAN). 
According to the logs, "cman_tool join" triggers the error 
"ccsd[846]: Error while processing get: No data available". The 
configuration on this machine is identical to that of the other 4 
machines, which are functioning fine in the cluster. Any help is 
appreciated.

Below is the cluster.xml file we are currently using, followed by the 
commands we run to bring the cluster up.
---
<?xml version="1.0"?>
<cluster name="webserver" config_version="1">
   <cman>
   </cman>
   <nodes>
     <node name="web1" votes="1">
       <fence>
         <method name="single">
           <device name="brocade" port="2"/>
         </method>
       </fence>
     </node>
     <node name="web2" votes="1">
       <fence>
         <method name="single">
           <device name="brocade" port="3"/>
         </method>
       </fence>
     </node>
     <node name="web3" votes="1">
       <fence>
         <method name="single">
           <device name="brocade" port="4"/>
         </method>
       </fence>
     </node>
     <node name="web4" votes="1">
       <fence>
         <method name="single">
           <device name="brocade" port="7"/>
         </method>
       </fence>
     </node>
     <node name="drLog" votes="1">
       <fence>
         <method name="single">
           <device name="brocade" port="6"/>
         </method>
       </fence>
     </node>
   </nodes>
   <fence_devices>
     <device name="brocade" agent="fence_brocade" ipaddr="10.30.3.3" 
login="xxx" passwd="xxx"/>
   </fence_devices>
</cluster>
---
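
The claim that web4's configuration is identical to the other nodes 
can be checked mechanically by comparing checksums of each node's 
cluster.xml. This is only a minimal sketch: same_config is a 
hypothetical helper, and it assumes the files from the other nodes 
have already been copied somewhere local (how they are fetched, and 
where cluster.xml lives on each node, is not stated in this post).

```sh
# Report whether all given files have the same checksum.
# Usage: same_config FILE [FILE...]
# (hypothetical helper; file locations are an assumption)
same_config() {
    first=""
    for f in "$@"; do
        # Reading via stdin makes cksum print "<crc> <size>" with no name.
        sum=$(cksum < "$f") || return 2
        if [ -z "$first" ]; then
            first=$sum
        elif [ "$sum" != "$first" ]; then
            echo "differs: $f"
            return 1
        fi
    done
    echo "all identical"
}
```

It stops at the first file whose checksum differs from the first 
argument, so the known-good copy should be listed first.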

---START COMMANDS---

 >modprobe dm-mod
Oct 22 14:03:44 web4 kernel: device-mapper: 4.1.0-ioctl (2003-12-10) 
initialised: dm at uk.sistina.com
---
 >modprobe gfs
Oct 22 14:03:57 web4 kernel: Lock_Harness <CVS> (built Jul 29 2004 
15:01:24) installed
Oct 22 14:03:57 web4 kernel: GFS <CVS> (built Jul 29 2004 15:00:53) 
installed
---
 >modprobe lock_dlm
Oct 22 14:05:38 web4 kernel: CMAN <CVS> (built Jul 29 2004 15:03:55) 
installed
Oct 22 14:05:38 web4 kernel: NET: Registered protocol family 31
Oct 22 14:05:38 web4 kernel: DLM <CVS> (built Jul 29 2004 15:04:12) 
installed
Oct 22 14:05:38 web4 kernel: Lock_DLM (built Jul 29 2004 15:01:08) 
installed
---
 >lsmod
Module                  Size  Used by
lock_dlm               36888  0
dlm                   129336  1 lock_dlm
cman                  128608  2 lock_dlm,dlm
gfs                   280628  0
lock_harness            5916  2 lock_dlm,gfs
dm_mod                 44704  0
qla2300               123520  0
qla2xxx               117280  1 qla2300
scsi_transport_fc       5120  1 qla2xxx
ip_conntrack_irc       72852  0
ip_conntrack_ftp       73620  0
usb_storage            30976  1
ohci_hcd               19332  0
e1000                  83972  0
tg3                    87684  0
---
 >devmap_mknod.sh
Creating /dev/mapper/control character device with major:10 minor:63
---
 >ccsd
{no output or logs}
---
 >cman_tool join
Oct 22 14:14:25 web4 ccsd[846]: Error while processing get: No data 
available
Oct 22 14:14:25 web4 last message repeated 5 times
Oct 22 14:14:25 web4 kernel: CMAN: Waiting to join or form a 
Linux-cluster
Oct 22 14:14:26 web4 kernel: CMAN: sending membership request
Oct 22 14:14:26 web4 kernel: CMAN: got node web2
Oct 22 14:14:26 web4 kernel: CMAN: got node web1
Oct 22 14:14:26 web4 kernel: CMAN: got node web3
Oct 22 14:14:26 web4 kernel: CMAN: got node drLog
Oct 22 14:14:26 web4 kernel: CMAN: error sending ACK: -1 {sometimes 
this error does not show up}
Oct 22 14:14:27 web4 kernel: CMAN: quorum regained, resuming activity
---
 >clvmd
{no output or logs}
---
 >vgchange -aly
{no output or logs}
---
 >fence_tool join
{no output or logs}
---
 >cat /proc/cluster/nodes
Node  Votes Exp Sts  Name
    1    1    5   M   web4
    2    1    5   M   web3
    3    1    5   M   web1
    4    1    5   M   drLog
    5    1    5   M   web2
---
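For scripted checks, the membership table above can be reduced to a 
count of nodes in state M. A minimal sketch, assuming the column 
layout shown (state in the fourth column); count_members is a 
hypothetical helper that reads the table on stdin, so it can be tried 
without a live cluster.

```sh
# Count rows whose "Sts" column (4th field) is M, skipping the header.
# Usage (assumed): count_members < /proc/cluster/nodes
count_members() {
    awk 'NR > 1 && $4 == "M" { n++ } END { print n + 0 }'
}
```

For the table shown here it would print 5.
---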
 >cat /proc/cluster/status
Version: 2.0.1
Config version: 1
Cluster name: webserver
Cluster ID: 56660
Membership state: Cluster-Member
Nodes: 5
Expected_votes: 5
Total_votes: 5
Quorum: 3
Active subsystems: 3
Node addresses: 10.20.4.254
---
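The status values above are consistent with a simple-majority quorum: 
5 expected votes give a quorum of 3. A minimal sketch of that 
arithmetic, assuming the rule is floor(expected_votes / 2) + 1 (this 
matches the numbers shown but is not confirmed anywhere in this 
post); status_field and quorum_for are hypothetical helper names.

```sh
# Print the value of one "Key: value" line from status text on stdin.
status_field() {
    awk -F': *' -v key="$1" '$1 == key { print $2 }'
}

# Simple-majority quorum (assumption): floor(expected_votes / 2) + 1.
quorum_for() {
    echo $(( $1 / 2 + 1 ))
}

# Usage (assumed):
#   expected=$(status_field Expected_votes < /proc/cluster/status)
#   quorum_for "$expected"
```

Comparing the result with the reported "Quorum" field is a quick 
sanity check that the node sees the same vote count as its peers.
---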
 >mount /webserver
{terminal locks up and the mount process cannot be killed}
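
The sequence above can be wrapped so that it stops at the first 
command that exits non-zero. This is a sketch rather than the actual 
startup script: run_step and start_cluster are hypothetical names, 
the commands are exactly those listed above, and DRY_RUN defaults to 
1 so that nothing is executed unless it is explicitly switched off.

```sh
# Print each startup step, and run it unless DRY_RUN=1 (the default).
DRY_RUN=${DRY_RUN:-1}

run_step() {
    printf '%s\n' ">$*"
    [ "$DRY_RUN" = "1" ] && return 0
    "$@" || { echo "step failed: $*" >&2; return 1; }
}

# Same order as the commands in this post; stops at the first failure.
start_cluster() {
    run_step modprobe dm-mod &&
    run_step modprobe gfs &&
    run_step modprobe lock_dlm &&
    run_step devmap_mknod.sh &&
    run_step ccsd &&
    run_step cman_tool join &&
    run_step clvmd &&
    run_step vgchange -aly &&
    run_step fence_tool join &&
    run_step mount /webserver
}
```

Exit codes alone may not catch everything: the ccsd error above was 
only seen in the logs, so the log output still needs to be read 
alongside a wrapper like this.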

Thanks.
Dede Dascalu.