[libvirt] [PATCH] qemu: Fix shutdown regression
eblake at redhat.com
Tue Sep 20 18:19:37 UTC 2011
On 09/20/2011 12:06 PM, Dave Allan wrote:
> On Tue, Sep 20, 2011 at 07:39:15PM +0200, Jiri Denemark wrote:
>> The commit that prevents disk corruption on domain shutdown
>> (96fc4784177ecb70357518fa863442455e45ad0e) causes a regression with QEMU
>> 0.14.* and 0.15.* because of a regression bug in QEMU that was fixed
>> only recently in QEMU git. With affected QEMU binaries, domains cannot
>> be shut down properly and stay in a paused state. This patch tries to
>> avoid this by sending SIGKILL to 0.14.* and 0.15.* QEMU processes, though
>> we wait a bit longer between sending SIGTERM and SIGKILL to reduce the
>> possibility of virtual disk corruption.
> IMO, SIGKILL should only be sent at the explicit direction of the
> user, saying in effect, I'm ok with possible data corruption, I want
> the VM killed unconditionally. I would rather leave VMs paused than
> risk corrupting data. Let's get as much input as we can from the qemu
> folks before we go down this path.
That echoes my sentiment that qemu needs to tell us whether the bug is
fixed. We know that if version < 0.14, the bug is not present, and if
version > 0.15, the bug is fixed; but in the 0.14.* and 0.15.* window we
don't know whether the vendor has back-ported the fix into the version of
qemu that we are targeting, unless we get some help from qemu.
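To make the three version ranges concrete, here is a rough sketch in C.
The enum and function names are made up for illustration and are not
libvirt or qemu API; it only encodes the ranges described above and
assumes the version has already been split into major and minor numbers.

```c
/* Hypothetical three-way answer; these names are illustrative only. */
enum shutdown_bug_state {
    SHUTDOWN_BUG_ABSENT,   /* older than 0.14: bug not present */
    SHUTDOWN_BUG_UNKNOWN,  /* 0.14.* or 0.15.*: fix may or may not be back-ported */
    SHUTDOWN_BUG_FIXED     /* newer than 0.15: bug fixed upstream */
};

/* Classify a qemu version by major.minor alone (assumed already parsed). */
enum shutdown_bug_state
classify_qemu_version(unsigned int major, unsigned int minor)
{
    if (major == 0 && minor < 14)
        return SHUTDOWN_BUG_ABSENT;
    if (major == 0 && minor <= 15)
        return SHUTDOWN_BUG_UNKNOWN;
    return SHUTDOWN_BUG_FIXED;
}
```

The middle value is the whole problem: the version number alone cannot
settle it, which is why some explicit signal from qemu would be needed.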
I also wonder if we should make it so that virDomainDestroy(dom) fails
with a reasonable message, rather than leaving the domain paused, if we
think qemu has the bug, and require the user to call
virDomainDestroyFlags(dom, VIR_DOMAIN_DESTROY_FORCE) as the means of
explicitly requesting that we work around the qemu bug.
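As a sketch of that policy only, not of any existing libvirt code, the
decision could look like the following. SKETCH_DESTROY_FORCE stands in
for the VIR_DOMAIN_DESTROY_FORCE flag proposed here, and the function
name and the boolean input are hypothetical.

```c
#include <stdbool.h>

/* Stand-in for the flag proposed in this message; not a real libvirt constant. */
#define SKETCH_DESTROY_FORCE (1u << 0)

/*
 * Decide whether a destroy request may proceed.
 * Returns 0 to proceed, or -1 if the caller should report an error
 * because qemu may have the bug and the user did not ask for force.
 */
int
sketch_destroy_check(bool qemu_may_have_bug, unsigned int flags)
{
    if (qemu_may_have_bug && !(flags & SKETCH_DESTROY_FORCE))
        return -1;
    return 0;
}
```

The point of the design is that the risky SIGKILL path is only ever
taken when the caller has passed the flag, so the possible data
corruption is an explicit user choice.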
Eric Blake eblake at redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org