[Linux-cluster] vm.sh: vm services depend on xend

Fri Jan 23 16:03:13 UTC 2009

Hi,

The service script 'vm.sh' determines the VM service status using the 
'xm' command. However, 'xm' relies on xend for proper operation. If xend 
is down, the status check fails even though the VM is still running, and 
the resulting recovery action can go as far as destroying the VM.

I would have filed this issue with RH support, but I feel the solution 
to this problem requires some qualified thinking in the first place.

What happened:

(Environment: production 4 node Xen / RHEL 5.2 cluster running 30+ pv 
guests, Nagios monitoring, VM services configured to "Restart" failover)

a) xenconsoled died (this happens from time to time, monitored by Nagios).

b) Operations guy ran "service xend restart" to bring xenconsoled back 
up. The restart operation implies that xend is down for a short period 
of time.

c) rgmanager checked 3 VMs within the time frame xend was down. In vm.sh

xm list $OCF_RESKEY_name &> /dev/null

failed as xm could not communicate with xend. As a result rgmanager 
tried to stop and restart these 3 VMs. Because xend was down only 
briefly, it was up again by the time rgmanager ran "vm.sh stop" on the 
3 VMs, so the 3 VMs were shut down properly and came up afterwards.

This was bad enough, but in fact we were lucky, as I learned when 
reproducing the issue in our test environment. A notable difference is 
that the test cluster is currently set to "Relocate" service recovery. 
I also had to shut down xend manually for the test, so it was down 
significantly longer than on the production cluster.

Background information on xend: xend is not required for Xen VMs to run, 
it is only required to control VMs. Restarting xend while VMs are 
running is a safe operation.

As a result of the longer xend downtime, "vm.sh stop" could not shut 
down the VM, as the stop operation again uses 'xm' to communicate with xend.

Afterwards rgmanager started the VM on another cluster node, where it 
came up perfectly well.

But the VM was never shut down on the cluster node where xend was not 
running. As a result the VM (which is installed on shared storage) was 
running on two different nodes at the same time, and its ext3 
filesystems were mounted rw by both VM instances.

No production server's filesystems would survive this for more than a 
couple of seconds. So there is a risk of severe damage here, especially 
as "relocate" is the default failover configuration.

As a workaround I propose to change vm.sh:

status()
{
+       xm info &> /dev/null || return 0
         xm list $OCF_RESKEY_name &> /dev/null
         if [ $? -eq 0 ]; then
                 return 0
         fi
         xm list migrating-$OCF_RESKEY_name &> /dev/null
         return $?
}

However, this is not good enough: xend may vanish between 'xm info' and 
'xm list', leading to the scenario described above.
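
To illustrate the race, here is a minimal sketch of a status check that 
brackets 'xm list' with an 'xm info' probe on both sides and reports 
"unknown" whenever xend does not answer. The function names 
xend_alive and vm_status_guarded and the return code 2 are assumptions 
made for this sketch; they are not part of vm.sh. It only narrows the 
window, it does not close it: xend could still go down and come back 
between the two probes.

```shell
# Sketch only; hypothetical helpers, not part of vm.sh.
# Return codes used here (an assumption of this sketch):
#   0 = xend lists the VM (or its migrating-<name> placeholder)
#   1 = VM not listed, and xend answered before and after the query
#   2 = xend did not answer, so the VM state is unknown

xend_alive()
{
        # 'xm info' needs xend, so it serves as a cheap reachability probe
        xm info > /dev/null 2>&1
}

vm_status_guarded()
{
        name="$1"

        xend_alive || return 2

        if xm list "$name" > /dev/null 2>&1 ||
           xm list "migrating-$name" > /dev/null 2>&1; then
                return 0
        fi

        # 'xm list' failed: trust that only if xend still answers now
        xend_alive || return 2
        return 1
}
```

How a caller maps the "unknown" result to an rgmanager status is a 
policy decision; the one-line workaround above is equivalent to 
treating it as success.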

Therefore xend should be a cluster service. The VM services would have 
to depend on the xend service. If a VM fails, rgmanager would have to 
additionally check xend, and only act on the VM if xend has not failed 
and the VM fails a second test (xend may have just come up again, so we 
need to retest the VM).

If a VM has failed and it turns out that xend has failed as well, 
rgmanager should try to reactivate xend.

If xend cannot be started, the cluster node has to be fenced. As xend is 
not required for VMs to run, the VMs may be perfectly fine and must not 
be restarted on another node unless they are guaranteed to be down.
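
The decision sequence proposed above could be sketched roughly as 
follows. This is an illustration only, not rgmanager code: the function 
names (vm_listed, xend_up, decide_recovery) and the printed result 
words are hypothetical, and the sketch assumes that 'xm info' answers 
whenever xend is up and that "service xend start" is the way to 
reactivate it.

```shell
# Sketch only; hypothetical helpers, not part of rgmanager or vm.sh.
# decide_recovery prints one of:
#   ok          VM is listed, nothing to do
#   recover-vm  xend is up and the VM failed the retest
#   fence-node  xend is down and could not be reactivated

vm_listed()
{
        xm list "$1" > /dev/null 2>&1 ||
        xm list "migrating-$1" > /dev/null 2>&1
}

xend_up()
{
        xm info > /dev/null 2>&1
}

decide_recovery()
{
        name="$1"

        if vm_listed "$name"; then
                echo ok
                return 0
        fi

        if ! xend_up; then
                # xend failed as well: try to reactivate it first.
                # A real implementation would probably need to wait
                # until xend accepts connections again.
                service xend start > /dev/null 2>&1
                if ! xend_up; then
                        # VM state cannot be determined; the VM may
                        # still be running, so only fencing is safe
                        echo fence-node
                        return 0
                fi
        fi

        # xend is up (possibly only just now): retest the VM
        if vm_listed "$name"; then
                echo ok
        else
                echo recover-vm
        fi
}
```

The point of the ordering is that a VM is only reported for recovery 
after a test that ran while xend was known to be reachable.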

Any comment is welcome.

best regards, Gunther

-- 
Gunther Schlegel
Manager IT Infrastructure

.............................................................
Riege Software International GmbH  Fon: +49 (2159) 9148 0
Mollsfeld 10                       Fax: +49 (2159) 9148 11
40670 Meerbusch                    Web: www.riege.com
Germany                            E-Mail: schlegel at riege.com
---                                ---
Handelsregister:                   Managing Directors:
Amtsgericht Neuss HRB-NR 4207      Christian Riege
USt-ID-Nr.: DE120585842            Gabriele  Riege
                                   Johannes  Riege
.............................................................
           YOU CARE FOR FREIGHT, WE CARE FOR YOU          
