[Linux-cluster] Cluster config. advice sought

Wolf Siedler siedler at hrd-asia.com
Thu Dec 17 07:41:13 UTC 2009


Dear all:

I am new to this list and cluster technology. Anyway, I managed to get a
cluster set up based on CentOS 5 with two nodes which worked very well
for several months.
Even several CentOS update rounds (all within version 5) worked flawlessly.
The cluster runs three paravirtualized Xen-based virtual machines whose
container files reside on an iSCSI storage vault. Even failover and failback worked perfectly.
Cluster control/management was handled by a separate standalone PC
running Conga.

Both cluster nodes and the adminpc are running CentOS 5. After another
CentOS update round in October, the cluster wouldn't start anymore. We
got that solved (cman wouldn't start, but manually updating to a newer
openais package, 0.80.6, got us past that), but now the virtual
machines always get started on all nodes simultaneously. Furthermore,
something in the Conga setup also seems to have broken: the Conga
web interface on the separate adminpc can still be accessed, but fails
when probing storage (broken ricci/luci communication?).
This never happened before the upgrade, and apart from the package
updates we changed neither hardware nor configuration. Unfortunately, I
no longer have access to the testing system (but we *did* do a lot of
testing before putting the system into production use).
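
For reference, these are the kinds of checks I can run on the production
nodes if more detail is needed (the ricci port check assumes the default
11111/tcp):

  # on each node
  rpm -q cman openais rgmanager xen ricci
  service cman status
  service rgmanager status
  cman_tool status     # quorum/membership as cman sees it
  cman_tool nodes
  clustat              # rgmanager's view of the vm services
  xm list              # which Xen domains are actually running where

  # on the adminpc (luci) and on the nodes (ricci)
  service luci status
  service ricci status
  telnet station1.example.com 11111    # assuming ricci's default port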

I would appreciate it if more experienced people could review our
configuration and point out any errors or possible improvements:

The cluster has two nodes (station1, station2) and one standalone PC for
administration running Conga (adminpc). The nodes are standard Dell 1950
servers.
The main storage location is a Dell storage vault, accessed via iSCSI
and mounted on both nodes at /rootfs/; the file system is GFS2. The
vault also provides a quorum partition.
Fencing is handled via the included DRAC remote access boards.
There are three paravirtualized Xen-based virtual machines
(vm_mailserver, vm_ldapserver, vm_adminserver). Their container files
are located at /rootfs/vmadminserver etc. The VMs are supposed to start
distributed across the nodes: vm_mailserver on station1, vm_ldapserver
and vm_adminserver on station2.
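
To illustrate how the guests are defined (a simplified sketch only, with
placeholder values rather than our actual settings), each VM has a plain
paravirt Xen definition which, if I understand the path="/rootfs"
attribute correctly, rgmanager picks up from the GFS2 volume:

  # /rootfs/vm_mailserver  (sketch, placeholder values)
  name       = "vm_mailserver"   # matches <vm name="..."> in cluster.conf
  memory     = 2048
  vcpus      = 2
  bootloader = "/usr/bin/pygrub"
  disk       = [ "file:/rootfs/vm_mailserver.img,xvda,w" ]
  vif        = [ "bridge=xenbr0" ]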

Software versions (identical on both nodes):
kernel 2.6.18-164.el5xen
openais-0.80.6-8.el5
cman-2.0.115-1.el5
rgmanager-2.0.52-1.el5.centos
xen-3.0.3-80.el5-3.3
xen-libs-3.0.3-80.el5-3.3
luci-0.12.1-7.3.el5.centos.1
ricci-0.12.1-7.3.el5.centos.1
gfs2-utils-0.1.62-1.el5

For reference, here is the cluster.conf that worked flawlessly before
the CentOS update (and no longer does):
===quote nonworking cluster.conf===
<?xml version="1.0"?>
<cluster alias="example_cluster_1" config_version="81" name="example_cluster_1">
    <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="30"/>
    <clusternodes>
        <clusternode name="station1.example.com" nodeid="1" votes="1">
            <fence>
                <method name="1">
                    <device name="station1_fenced"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="station2.example.com" nodeid="2" votes="1">
            <fence>
                <method name="1">
                    <device name="station2_fenced"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <cman expected_votes="3" two_node="0"/>
    <fencedevices>
        <fencedevice agent="fence_ipmilan" ipaddr="172.16.10.91" login="ipmi_admin" name="station1_fenced" operation="off" passwd="secret"/>
        <fencedevice agent="fence_ipmilan" ipaddr="172.16.10.92" login="ipmi_admin" name="station2_fenced" operation="off" passwd="secret"/>
    </fencedevices>
    <rm>
        <failoverdomains>
            <failoverdomain name="bias-station1" nofailback="0" ordered="0" restricted="0">
                <failoverdomainnode name="station1.example.com" priority="1"/>
            </failoverdomain>
            <failoverdomain name="bias-station2" nofailback="0" ordered="0" restricted="0">
                <failoverdomainnode name="station2.example.com" priority="1"/>
            </failoverdomain>
        </failoverdomains>
        <resources/>
        <vm autostart="1" domain="bias-station1" exclusive="0" migrate="live" name="vm_mailserver" path="/rootfs" recovery="restart"/>
        <vm autostart="1" domain="bias-station2" exclusive="0" migrate="live" name="vm_ldapserver" path="/rootfs" recovery="restart"/>
        <vm autostart="1" domain="bias-station2" exclusive="0" migrate="live" name="vm_adminserver" path="/rootfs" recovery="restart"/>
    </rm>
    <quorumd interval="3" label="xen_qdisk" min_score="1" tko="23" votes="1"/>
</cluster>
===unquote nonworking cluster.conf===

As explained, this configuration worked flawlessly for ten months.
Only after the CentOS update did it start the virtual machines
simultaneously on both station1 *and* station2 instead of distributing
them as per the <vm ...> directives. We temporarily worked around this
problem by changing the autostart parameter to <vm autostart="0" ...>.
At least this got the cluster running again, but we lost the desired
automatic restart should a system hang, and failover also no longer
seems to work.
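
With autostart disabled we currently have to bring the VMs up by hand on
the intended nodes; if I understand the rgmanager tools correctly, the
equivalent commands would be along these lines:

  # enable (start) each vm service on its intended member
  clusvcadm -e vm:vm_mailserver  -m station1.example.com
  clusvcadm -e vm:vm_ldapserver  -m station2.example.com
  clusvcadm -e vm:vm_adminserver -m station2.example.com

  # live-migrate a vm service between members
  clusvcadm -M vm:vm_mailserver -m station2.example.com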

I have read several messages on this list from users who seem to have
had a similar problem. It looks to me as if I may have missed the
use_virsh="0" attribute.

Hence my question: is the following a valid cluster.conf for such a
setup (distributed VMs, automatic start, failover/failback)?
===quote===
<?xml version="1.0"?>
<cluster alias="example_cluster_1" config_version="81" name="example_cluster_1">
    <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="30"/>
    <clusternodes>
        <clusternode name="station1.example.com" nodeid="1" votes="1">
            <fence>
                <method name="1">
                    <device name="station1_fenced"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="station2.example.com" nodeid="2" votes="1">
            <fence>
                <method name="1">
                    <device name="station2_fenced"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <cman expected_votes="3" two_node="0"/>
    <fencedevices>
        <fencedevice agent="fence_ipmilan" ipaddr="172.16.10.91" login="ipmi_admin" name="station1_fenced" operation="off" passwd="secret"/>
        <fencedevice agent="fence_ipmilan" ipaddr="172.16.10.92" login="ipmi_admin" name="station2_fenced" operation="off" passwd="secret"/>
    </fencedevices>
    <rm>
        <failoverdomains>
            <failoverdomain name="bias-station1" nofailback="0" ordered="0" restricted="0">
                <failoverdomainnode name="station1.example.com" priority="1"/>
            </failoverdomain>
            <failoverdomain name="bias-station2" nofailback="0" ordered="0" restricted="0">
                <failoverdomainnode name="station2.example.com" priority="1"/>
            </failoverdomain>
        </failoverdomains>
        <resources/>
        <vm autostart="1" use_virsh="0" domain="bias-station1" exclusive="0" migrate="live" name="vm_mailserver" path="/rootfs" recovery="restart"/>
        <vm autostart="1" use_virsh="0" domain="bias-station2" exclusive="0" migrate="live" name="vm_ldapserver" path="/rootfs" recovery="restart"/>
        <vm autostart="1" use_virsh="0" domain="bias-station2" exclusive="0" migrate="live" name="vm_adminserver" path="/rootfs" recovery="restart"/>
    </rm>
    <quorumd interval="3" label="xen_qdisk" min_score="1" tko="23" votes="1"/>
</cluster>
===unquote===
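
If this configuration looks sane, my plan for rolling it out without
restarting the whole cluster would be roughly the following (I would of
course also bump config_version, e.g. from 81 to 82); please correct me
if this is not the right procedure:

  # on one node, after editing /etc/cluster/cluster.conf and
  # incrementing config_version:
  ccs_tool update /etc/cluster/cluster.conf
  cman_tool version -r 82

  # then verify on both nodes:
  cman_tool version
  clustat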

I am open to further updates/testing and will gladly provide additional
details if needed.
But as this setup also contains production systems, I want to avoid any
fundamental mistakes/oversights.

Needless to say, I would appreciate any feedback/suggestions!

Regards,
Wolf



