[Linux-cluster] Virtual services won't start.

Hunt, Gary Gary_Hunt at gallup.com
Wed Feb 11 17:22:03 UTC 2009


I am having an issue with our two-node cluster and am hoping someone has seen this before.

It is a two-node cluster with a quorum disk, running RHEL 5.3.

I took each node down for some maintenance.  After the reboot I couldn't get luci to start any of the virtual servers on either node; it reported "cluster service manager is not running".
I tried rebooting the nodes to see if that would help, but the shutdown would hang on "Waiting for services to stop", and I had to issue a second reboot command to get each server down.

Out of the blue I decided to remove the quorum disk, and things started working again.  I added the quorum disk back in and it is still working.  This is the second time this has happened.  Each time, we had been operational for a week or so beforehand, with several failover tests completing without issue.

Both times the cluster seemed happy and generated no errors on startup.  Any insight would be greatly appreciated.  My cluster.conf is posted below.


<?xml version="1.0"?>
<cluster alias="xencluster" config_version="40" name="xencluster">
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="ricci1b.gallup.com" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="ricci1b"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="ricci2b.gallup.com" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="ricci2b"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="3" two_node="0"/>
        <fencedevices>
                <fencedevice agent="fence_ipmilan" ipaddr="172.30.3.110" login="xxxx" name="ricci1b" passwd="xxxxxx"/>
                <fencedevice agent="fence_ipmilan" ipaddr="172.30.3.140" login="xxxx" name="ricci2b" passwd="xxxxxx"/>
        </fencedevices>
        <rm>
                <failoverdomains/>
                <resources/>
                <vm autostart="1" exclusive="0" name="rhel_full" path="/xenconfigs" recovery="restart"/>
                <vm autostart="1" exclusive="0" name="rhel_para" path="/xenconfigs" recovery="restart"/>
        </rm>
        <quorumd interval="5" label="quorum_disk_from_ricci1" min_score="1" tko="3" votes="1"/>
</cluster>
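
For reference, this is how I understand the vote arithmetic implied by the config above. It is only a rough sketch: it assumes quorum is a simple majority of expected_votes (floor(expected_votes / 2) + 1), which is my assumption and not something I have checked against the cman source. The function names are made up for illustration; the vote counts come from the votes and expected_votes attributes in cluster.conf.

```python
# Rough sketch of the vote arithmetic for the cluster.conf above.
# ASSUMPTION: quorum is a simple majority of expected_votes.
# The helper names here are hypothetical, not part of any cluster tool.

EXPECTED_VOTES = 3   # <cman expected_votes="3" .../>
NODE_VOTES = 1       # votes="1" on each <clusternode>
QDISK_VOTES = 1      # votes="1" on <quorumd>


def quorum_threshold(expected_votes: int) -> int:
    """Smallest vote count that is a strict majority of expected_votes."""
    return expected_votes // 2 + 1


def is_quorate(votes_present: int, expected_votes: int = EXPECTED_VOTES) -> bool:
    """True if the votes currently present reach the majority threshold."""
    return votes_present >= quorum_threshold(expected_votes)


if __name__ == "__main__":
    scenarios = {
        "both nodes + quorum disk": 2 * NODE_VOTES + QDISK_VOTES,
        "both nodes, no quorum disk": 2 * NODE_VOTES,
        "one node + quorum disk": NODE_VOTES + QDISK_VOTES,
        "one node alone": NODE_VOTES,
    }
    for name, votes in scenarios.items():
        print(f"{name}: {votes} vote(s), quorate={is_quorate(votes)}")
```

If that assumption holds, a single node should only stay quorate while the quorum disk vote is counted, so by the numbers alone the quorum disk should not be preventing the services from starting when both nodes are up.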



Thanks



Gary
