[libvirt] [Qemu-devel] [PATCH] qemu: Fix shutdown regression

Anthony Liguori anthony at codemonkey.ws
Tue Sep 20 20:12:59 UTC 2011


On 09/20/2011 02:03 PM, Eric Blake wrote:
> On 09/20/2011 12:52 PM, Anthony Liguori wrote:
>> On 09/20/2011 01:01 PM, Eric Blake wrote:
>>> On 09/20/2011 11:39 AM, Jiri Denemark wrote:
>>>> The commit that prevents disk corruption on domain shutdown
>>>> (96fc4784177ecb70357518fa863442455e45ad0e) causes regression with QEMU
>>>> 0.14.* and 0.15.* because of a regression bug in QEMU that was fixed
>>>> only recently in QEMU git. With affected QEMU binaries, domains cannot
>>>> be shutdown properly and stay in a paused state. This patch tries to
>>>> avoid this by sending SIGKILL to 0.1[45].* QEMU processes. Though we
>>>> wait a bit more between sending SIGTERM and SIGKILL to reduce the
>>>> possibility of virtual disk corruption.
>>>> ---
>>>> src/qemu/qemu_capabilities.c | 7 +++++++
>>>> src/qemu/qemu_capabilities.h | 1 +
>>>> src/qemu/qemu_process.c | 19 +++++++++++++------
>>>> 3 files changed, 21 insertions(+), 6 deletions(-)
>>>
>>> ACK. But it would be nice if upstream qemu could give us a more reliable
>>> indication of whether the qemu SIGTERM bug is fixed, so that we don't
>>> corrupt data on a patched 0.14 or 0.15 qemu.
>>
>> Can you be a lot more specific about what bug you mean?
>>
>
> https://bugzilla.redhat.com/show_bug.cgi?id=739895

That was only applied last week, so no, it's not in any release right now.

>
>>> That is, as part of fixing the bug in qemu,
>>> we should also update -help text or something similar, so that libvirt
>>> can avoid making decisions solely on version numbers.
>>
>> The version number *is* the right way to make decisions. We've gone
>> through this dozens of times.
>>
>> The fact that distros backport all sorts of stuff means that you need to
>> maintain a matrix of versions with features. It's not our (upstream
>> QEMU's) responsibility to tell you the differences that exist in forks
>> of QEMU.
>
> Version numbers are lousy, precisely because they are not granular enough.
> That's why the autoconf philosophy frowns so heavily on version checks, and
> prefers feature checks instead.
>
> We want to know which features are present,

Features and bugs are different things.  I'm all for providing ways to detect 
whether we support certain commands in QMP, command line options, etc.

> not which versions introduced which
> features. In this case, we want to know about a particular feature (SIGTERM is
> not broken), which we know exists later than 0.15, but which might also exist as
> a backport in 0.14 or 0.15.

No, what you want to know is whether d9389b9664df561db796b18eb8309fffe58faf8b
exists in this build of QEMU.  But what makes d9389b more important than d296363
or db118fe72?

If you want to know whether a bug that is important to *you* is fixed, then you
should check the git log corresponding to that version and embed that info in
libvirt.  Then libvirt is entirely empowered to add whatever bug fixes you
think are important to the table that you maintain.
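
As a rough illustration of that table, here is a minimal sketch in Python
(libvirt itself is C).  Every name in it is hypothetical, the build identifier
is made up, and treating releases older than 0.14 as unaffected is an
assumption based only on what was said in this thread:

```python
from typing import Tuple

# Hypothetical sketch: none of these names exist in libvirt or QEMU.
Version = Tuple[int, int, int]

# Upstream release series that, per this thread, carry the SIGTERM regression.
SIGTERM_BROKEN_SERIES = {(0, 14), (0, 15)}

# Builds known (from reading their git log or package changelog) to carry
# the fix as a backport.  The entry below is invented for illustration.
BACKPORTED_FIX_BUILDS = {"0.15.0-example.1"}


def sigterm_is_safe(version: Version, build_id: str = "") -> bool:
    """Return True if SIGTERM is expected to shut this QEMU down cleanly."""
    # A build listed as carrying the backport overrides the version check.
    if build_id in BACKPORTED_FIX_BUILDS:
        return True
    # Otherwise decide purely on the upstream major.minor series.
    return version[:2] not in SIGTERM_BROKEN_SERIES
```

Whoever ships a patched 0.14 or 0.15 adds their build to the override set; the
upstream series list changes only when a new release turns out to be affected.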

> If qemu tells us that information, then upstream
> libvirt can make the decision correctly regardless of how distros backport the
> patch. But if qemu does not expose the information, then upstream libvirt must
> be pessimistic, and you've now forced the distros to do double-duty - they must
> backport both the qemu fix, and write a distro-specific libvirt patch that
> alters the version matrix to play with the distro build of qemu.

Or distros could use the QEMU stable branch as their base and invest in
backporting to QEMU stable instead of maintaining private backport trees.

Regards,

Anthony Liguori



