[Linux-cluster] KVM guest managed by cluster

Alex Re are at gmx.es
Fri Apr 23 13:13:58 UTC 2010


Hi,
I have discovered what I was doing wrong... in the <vm> definition, vm.sh 
checks whether the "path" parameter is set, and if it is, it uses "xm" instead 
of "virsh" (no matter whether you have use_virsh="1").
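For reference, a minimal sketch of how the <vm> resource might look with the 
"path" attribute dropped, so that vm.sh falls through to "virsh". This is an 
assumption based on the behaviour described above, not a tested config: it 
presumes the guest is already defined in libvirt on both nodes (e.g. via 
"virsh define guest00.xml"), and uses a plain qemu:///system URI:

```xml
<!-- Hypothetical sketch: <vm> without "path", so vm.sh uses virsh.
     Assumes guest00 is already defined in libvirt on every node. -->
<service autostart="1" exclusive="0" domain="FD1"
         name="guest00_service" recovery="relocate">
  <vm name="guest00" migrate="live" use_virsh="1"
      hypervisor="qemu" hypervisor_uri="qemu:///system"/>
</service>
```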

I discovered that "xm" was being used by testing the service using 
"rg_test":
"rg_test test /etc/cluster/cluster.conf start service guest00_service"

Sorry for the question/self-answer loop.

Alex


On 04/23/2010 11:57 AM, Alex Re wrote:
> Hi!
> I have been trying to get a KVM guest running as a clustered service 
> (a two-node cluster with GFS2 shared images), in order to restart the 
> guest on the surviving cluster node in case the other node crashes. The 
> problem is that I can't get the VM service managed by the cluster 
> daemons (manually starting/stopping/live-migrating my VM guest works fine).
> This is what my "cluster.conf" file looks like:
>
> <?xml version="1.0"?>
> <cluster config_version="16" name="KVMCluster">
> <fence_daemon post_fail_delay="0" post_join_delay="10"/>
> <clusternodes>
> <clusternode name="nodeAint" nodeid="1" votes="1">
> <multicast addr="239.0.0.1" interface="eth2"/>
> <fence>
> <method name="single">
> <device name="nodeA_ilo"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="nodeBint" nodeid="2" votes="1">
> <multicast addr="239.0.0.1" interface="eth2"/>
> <fence>
> <method name="single">
> <device name="nodeB_ilo"/>
> </method>
> </fence>
> </clusternode>
> </clusternodes>
> <quorumd interval="1" label="QuoDisk" tko="10" votes="1"/>
> <cman expected_votes="3" two_node="0">
> <multicast addr="239.0.0.1"/>
> </cman>
> <fencedevices>
> <fencedevice agent="fence_ilo" hostname="nodeAcn" login="hp" 
> name="nodeA_ilo" passwd="hpinvent"/>
> <fencedevice agent="fence_ilo" hostname="nodeBcn" login="hp" 
> name="nodeB_ilo" passwd="hpinvent"/>
> </fencedevices>
> <rm log_level="7">
> <failoverdomains>
> <failoverdomain name="FD1" ordered="0" restricted="0">
> <failoverdomainnode name="nodeAint" priority="1"/>
> <failoverdomainnode name="nodeBint" priority="1"/>
> </failoverdomain>
> </failoverdomains>
> <service autostart="1" exclusive="0" domain="FD1" 
> name="guest00_service" recovery="relocate">
> <vm domain="FD1" autostart="1" migrate="live" use_virsh="1" 
> hypervisor="qemu" name="guest00" hypervisor_uri="qemu+ssh:///system" 
> path="/etc/libvirt/qemu/guest00.xml">
> </vm>
> </service>
> <resources/>
> </rm>
> <dlm plock_ownership="1" plock_rate_limit="0"/>
> <gfs_controld plock_rate_limit="0"/>
> </cluster>
>
> And these are the errors I'm getting in syslog:
> Apr 23 11:28:44 nodeB clurgmgrd[5490]: <notice> Resource Group Manager 
> Starting
> Apr 23 11:28:44 nodeB clurgmgrd[5490]: <info> Loading Service Data
> Apr 23 11:28:45 nodeB clurgmgrd[5490]: <info> Initializing Services
> Apr 23 11:28:45 nodeB clurgmgrd: [5490]: <crit> xend/libvirtd is dead; 
> cannot stop guest00
> Apr 23 11:28:45 nodeB clurgmgrd[5490]: <notice> stop on vm "guest00" 
> returned 1 (generic error)
> Apr 23 11:28:45 nodeB clurgmgrd[5490]: <info> Services Initialized
> Apr 23 11:28:45 nodeB clurgmgrd[5490]: <info> State change: Local UP
> Apr 23 11:28:51 nodeB clurgmgrd[5490]: <notice> Starting stopped 
> service service:guest00_service
> Apr 23 11:28:51 nodeB clurgmgrd[5490]: <notice> start on vm "guest00" 
> returned 127 (unspecified)
> Apr 23 11:28:51 nodeB clurgmgrd[5490]: <warning> #68: Failed to start 
> service:guest00_service; return value: 1
> Apr 23 11:28:51 nodeB clurgmgrd[5490]: <notice> Stopping service 
> service:guest00_service
> Apr 23 11:28:51 nodeB clurgmgrd: [5490]: <crit> xend/libvirtd is dead; 
> cannot stop guest00
> Apr 23 11:28:51 nodeB clurgmgrd[5490]: <notice> stop on vm "guest00" 
> returned 1 (generic error)
> Apr 23 11:28:51 nodeB clurgmgrd[5490]: <crit> #12: RG 
> service:guest00_service failed to stop; intervention required
> Apr 23 11:28:51 nodeB clurgmgrd[5490]: <notice> Service 
> service:guest00_service is failed
> Apr 23 11:28:51 nodeB clurgmgrd[5490]: <crit> #13: Service 
> service:guest00_service failed to stop cleanly
>
> I have checked the status of the libvirtd daemon, and it's running fine:
> [root@nodeB ~]# service libvirtd status
> libvirtd (pid  5352) is running...
>
> And all VM guest management using "virsh" also works fine.
> I'm using: "cman-2.0.115-1.el5_4.9", 
> "rgmanager-2.0.52-1.el5.centos.2", "libvirt-0.6.3-20.1.el5_4"
>
> Am I missing something in "cluster.conf"? Or in the libvirtd daemon configuration?
> Thanks for your help!
>
> Alex.
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

