[Linux-cluster] CentOS + Conga: luci shows incorrect service status + xen vm service fails

Wed Aug 13 13:49:51 UTC 2008

Hello,

I am currently experimenting with Conga on CentOS 5.2 with two cluster nodes 
(centos1, centos2) and a management machine (centos0).

I created a service (service1) with a script resource. When I start the 
service through the luci web interface on centos0, luci always shows it as 
stopped afterwards, even though the service is actually running: I can 
control it with clusvcadm on the nodes, and clustat on the nodes confirms 
that it is running.

I also tried to create a virtual machine service (mpevm1). For this I created 
a Xen VM whose config file and disk file are on an NFS mount accessible by 
both nodes, and added it to the cluster as a virtual machine service. When I 
try to start the service, it fails.
/var/log/messages shows:
Aug 13 15:32:50 centos2 clurgmgrd[17230]: <notice> Starting stopped service vm:mpevm1
Aug 13 15:32:50 centos2 clurgmgrd[17230]: <notice> start on vm "mpevm1" returned 1 (generic error)
Aug 13 15:32:50 centos2 clurgmgrd[17230]: <warning> #68: Failed to start vm:mpevm1; return value: 1
Aug 13 15:32:50 centos2 clurgmgrd[17230]: <notice> Stopping service vm:mpevm1
Aug 13 15:32:56 centos2 clurgmgrd[17230]: <notice> Service vm:mpevm1 is recovering
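
To pick the failing entries out of a longer log, something like the following could be used. It is only a minimal sketch: it assumes each clurgmgrd message sits on one line, as it does in /var/log/messages, and the names LINE_RE, parse_rgmanager_line and problem_entries are made up for illustration.

```python
import re

# Assumed layout, taken from the excerpt above:
# "<month> <day> <time> <host> clurgmgrd[<pid>]: <level> message"
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s+[\d:]{8})\s+"
    r"(?P<host>\S+)\s+clurgmgrd\[(?P<pid>\d+)\]:\s+"
    r"<(?P<level>\w+)>\s+(?P<message>.*)$"
)


def parse_rgmanager_line(line):
    """Return a dict of fields for one clurgmgrd line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def problem_entries(lines, levels=("warning", "err", "crit")):
    """Return parsed entries whose level is in `levels` (hypothetical default set)."""
    parsed = (parse_rgmanager_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry["level"] in levels]
```

Feeding it the lines of /var/log/messages from both nodes would leave only the "#68: Failed to start" style entries, which makes it easier to compare what each node reported.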

The same happens when I try to start it with clusvcadm. xend is configured 
for relocation, and starting the Xen VM manually with "xm create" works. 
Migrating it with "xm migrate" also succeeds. It just doesn't work through 
the Conga tools or clusvcadm.

Does anyone have any ideas on these two problems?

Greetings
Sebastian Woehrl

PS: My cluster.conf:

<?xml version="1.0"?>
<cluster alias="cluster1" config_version="21" name="cluster1">
<fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
<clusternodes>
    <clusternode name="centos1.mpe.mpg.de" nodeid="1" votes="1">
        <fence/>
    </clusternode>
    <clusternode name="centos2.mpe.mpg.de" nodeid="2" votes="1">
        <fence/>
    </clusternode>
</clusternodes>
<cman expected_votes="1" two_node="1"/>
<fencedevices/>
<rm>
    <failoverdomains>
        <failoverdomain name="failover1" nofailback="0" ordered="1" restricted="1">
            <failoverdomainnode name="centos1.mpe.mpg.de" priority="1"/>
            <failoverdomainnode name="centos2.mpe.mpg.de" priority="2"/>
        </failoverdomain>
    </failoverdomains>
    <resources/>
    <service autostart="0" domain="failover1" exclusive="1" name="service1" recovery="relocate">
        <script file="/shell0" name="shell0"/>
    </service>
    <vm autostart="0" domain="failover1" exclusive="1" migrate="live" name="mpevm1" path="/var/xen/mpevm1" recovery="relocate"/>
</rm>
</cluster>
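
For a quick overview of what rgmanager is asked to manage, the configuration above can be summarised with a few lines of Python. This is only a sketch under assumptions: the function name list_rm_resources is hypothetical, and it looks only at the service and vm elements directly under rm, as they appear in this file.

```python
import xml.etree.ElementTree as ET


def list_rm_resources(xml_text):
    """Return one dict per <service> or <vm> element found directly under <rm>."""
    root = ET.fromstring(xml_text)  # raises ParseError if the file is not well-formed
    rm = root.find("rm")
    if rm is None:
        return []
    resources = []
    for elem in rm:
        if elem.tag in ("service", "vm"):
            resources.append({
                "type": elem.tag,
                "name": elem.get("name"),
                "domain": elem.get("domain"),
                "recovery": elem.get("recovery"),
                # Only set on the vm element in the configuration above.
                "path": elem.get("path"),
            })
    return resources


if __name__ == "__main__":
    # Assumption: the configuration lives at the usual /etc/cluster/cluster.conf.
    with open("/etc/cluster/cluster.conf") as conf:
        for res in list_rm_resources(conf.read()):
            print(res)
```

Running this on each node and comparing the output is one way to confirm that both nodes see the same service and vm definitions.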