[Linux-cluster] clurgmgrd - <err> #48: Unable to obtain cluster lock: Connection timed out

rhurst at bidmc.harvard.edu rhurst at bidmc.harvard.edu
Mon May 7 17:54:56 UTC 2007


What could cause clurgmgrd to fail like this?  If clurgmgrd has a hiccup
like this, is it supposed to shut down its services?  Is there something
in our implementation that could have prevented the service from
shutting down?

For unexplained reasons, we just had our CS service (WATSON) go down on
its own, and the syslog entry details the event as:

May  7 13:18:39 db1 clurgmgrd[17888]: <err> #48: Unable to obtain cluster lock: Connection timed out
May  7 13:18:41 db1 kernel: dlm: Magma: reply from 2 no lock
May  7 13:18:41 db1 kernel: dlm: reply
May  7 13:18:41 db1 kernel: rh_cmd 5
May  7 13:18:41 db1 kernel: rh_lkid 200242
May  7 13:18:41 db1 kernel: lockstate 2
May  7 13:18:41 db1 kernel: nodeid 0
May  7 13:18:41 db1 kernel: status 0
May  7 13:18:41 db1 kernel: lkid ee0388
May  7 13:18:41 db1 clurgmgrd[17888]: <notice> Stopping service WATSON 

... and its service entry looks like this:

<service autostart="0" domain="DB" exclusive="1" name="WATSON"
         recovery="disable">
    <ip address="192.168.3.111" monitor_link="1"/>
    <fs device="/dev/VGWATSON/lvoldata" force_fsck="0" force_unmount="1"
        fsid="53188" fstype="ext3" mountpoint="/watson-data"
        name="WATSON-lvoldata" options="" self_fence="0">
        <fs device="/dev/VGWATSON/lvoldb1" force_fsck="0"
            force_unmount="1" fsid="29524" fstype="ext3"
            mountpoint="/watson-data/sys/db1" name="WATSON-lvoldb1"
            options="" self_fence="0"/>
        <script file="/etc/init.d/WATSON" name="WATSON RC"/>
    </fs>
    <clusterfs ref="WATSON-lvol0">
        <clusterfs ref="WATSON-lvol1"/>
    </clusterfs>
</service>
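
For anyone who wants to compare this stanza against their own, here is a
rough Python sketch that pulls out the service-level policy attributes
(autostart, exclusive, recovery, domain) and the nested mount points. It
is only an illustration: the names SERVICE_XML and summarize_service are
made up for this example, and it reads a pasted copy of the stanza rather
than the live cluster.conf.

```python
import xml.etree.ElementTree as ET

# Copy of the service stanza shown above, pasted in as a string
# (an assumption for this sketch; nothing here reads cluster.conf itself).
SERVICE_XML = """
<service autostart="0" domain="DB" exclusive="1" name="WATSON"
         recovery="disable">
    <ip address="192.168.3.111" monitor_link="1"/>
    <fs device="/dev/VGWATSON/lvoldata" force_fsck="0" force_unmount="1"
        fsid="53188" fstype="ext3" mountpoint="/watson-data"
        name="WATSON-lvoldata" options="" self_fence="0">
        <fs device="/dev/VGWATSON/lvoldb1" force_fsck="0"
            force_unmount="1" fsid="29524" fstype="ext3"
            mountpoint="/watson-data/sys/db1" name="WATSON-lvoldb1"
            options="" self_fence="0"/>
        <script file="/etc/init.d/WATSON" name="WATSON RC"/>
    </fs>
    <clusterfs ref="WATSON-lvol0">
        <clusterfs ref="WATSON-lvol1"/>
    </clusterfs>
</service>
"""

POLICY_KEYS = ("name", "domain", "autostart", "exclusive", "recovery")


def summarize_service(xml_text):
    """Return (policy attributes, fs mount points in document order)."""
    service = ET.fromstring(xml_text)
    # Attributes set on the <service> element itself.
    policy = {key: service.get(key) for key in POLICY_KEYS}
    # Every <fs> element at any depth, outermost first.
    mounts = [fs.get("mountpoint") for fs in service.iter("fs")]
    return policy, mounts


if __name__ == "__main__":
    policy, mounts = summarize_service(SERVICE_XML)
    for key in POLICY_KEYS:
        print(f"{key:10s} = {policy[key]}")
    for mount in mounts:
        print(f"fs mount   = {mount}")
```

The sketch only reports what the stanza says (for example
recovery="disable"); it does not interpret how rgmanager acts on those
values.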


Robert Hurst, Sr. Caché Administrator
Beth Israel Deaconess Medical Center
1135 Tremont Street, REN-7
Boston, Massachusetts   02120-2140
617-754-8754 ∙ Fax: 617-754-8730 ∙ Cell: 401-787-3154
Any technology distinguishable from magic is insufficiently advanced.