[libvirt] [PATCH] qemu: never send SIGKILL to qemu process unless specifically requested

Laine Stump laine at laine.org
Mon Jan 30 20:30:18 UTC 2012


On 01/30/2012 06:02 AM, Daniel P. Berrange wrote:
> On Fri, Jan 27, 2012 at 01:35:35PM -0500, Laine Stump wrote:
>> When libvirt is shutting down the qemu process, it first sends
>> SIGTERM, then waits for 1.6 seconds and, if it sees the process still
>> there, sends a SIGKILL.
>>
>> There have been reports that this behavior can lead to data loss
>> because the guest running in qemu doesn't have time to flush its disk
>> cache buffers before it's unceremoniously whacked.
>>
>> One suggestion on how to solve that problem was to remove SIGKILL from
>> the normal virDomainDestroyFlags, but still provide the ability to
>> kill qemu with SIGKILL by using a new flag to virDomainDestroyFlags.
>> This patch is a quick attempt at that in order to start a
>> conversation on the topic.
>>
>> So what are your opinions? Is this the right way to solve the problem?
> No, we can't change the default semantics of virDomainDestroy in
> this case. Applications expect that we do absolutely everything
> possible to kill off the guest. This is particularly important for
> cluster fencing usage. If we only use SIGTERM, then we're introducing
> unacceptable risk to apps relying on this.
>
> We could do the opposite though - have a flag to do a graceful
> destroy, leaving the default as un-graceful.

virDomainShutdown ends up calling qemuProcessKill() too. So, I guess we 
need to add a flag there too.

In the meantime, shouldn't we at least wait longer before resorting to
SIGKILL, especially since the current 1.6 second timeout appears to be
too short quite often? If we don't at least do that, what we're saying is
"the behavior of virDomainShutdown / virDomainDestroy is to lose your data
unless you're lucky. If you don't want this behavior, you need to use
virDomainXXXFlags and specify the VIR_DOMAIN_DONT_TRASH_MY_DATA flag" :-P
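
To make the two knobs under discussion concrete (how long to wait after
SIGTERM, and whether to escalate to SIGKILL at all), here is a rough
sketch. It is Python rather than libvirt's C, it is POSIX-only, and it is
not the code in qemuProcessKill(); stop_process, grace_seconds, force and
poll_interval are all hypothetical names made up for illustration.

```python
import signal
import subprocess
import sys
import time


def stop_process(proc, grace_seconds=1.6, force=True, poll_interval=0.05):
    """Send SIGTERM, wait up to grace_seconds, then optionally SIGKILL.

    Hypothetical helper, not libvirt code. Returns "term" if the process
    exited within the grace period, "kill" if SIGKILL was needed, or
    "alive" if force is False and the process is still running.
    """
    proc.send_signal(signal.SIGTERM)

    # Poll until the process exits or the grace period runs out.
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        if proc.poll() is not None:
            return "term"
        time.sleep(poll_interval)

    # One last check in case it exited during the final sleep.
    if proc.poll() is not None:
        return "term"

    if not force:
        # Graceful-only mode: report the failure, leave the process alone.
        return "alive"

    proc.kill()   # SIGKILL on POSIX; cannot be caught or ignored
    proc.wait()   # reap the child so it does not linger as a zombie
    return "kill"


if __name__ == "__main__":
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    print(stop_process(child, grace_seconds=5.0))
```

In these terms, the interim suggestion above is just a larger
grace_seconds, while a "graceful destroy" flag would correspond to
force=False, with the un-graceful default (force=True) left unchanged.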



