[libvirt] [Question]Libvirt doesn't care about qemu monitor event if fail to destroy qemu process

Michal Privoznik mprivozn at redhat.com
Mon Mar 5 09:50:01 UTC 2018


On 03/05/2018 10:39 AM, Wuzongyong (Euler Dept) wrote:
> 
> 
> Thanks,
> Zongyong Wu

[Please don't top post on technical lists]

> 
> 
>> -----Original Message-----
>> From: Michal Privoznik [mailto:mprivozn at redhat.com]
>> Sent: Monday, March 05, 2018 5:27 PM
>> To: Wuzongyong (Euler Dept) <cordius.wu at huawei.com>; libvir-
>> list at redhat.com
>> Cc: Wanzongshun (Vincent) <wanzongshun at huawei.com>; weijinfen
>> <weijinfen at huawei.com>
>> Subject: Re: [libvirt] [Question]Libvirt doesn't care about qemu monitor
>> event if fail to destroy qemu process
>>
>> On 03/05/2018 03:20 AM, Wuzongyong (Euler Dept) wrote:
>>> Hi,
>>>
>>> We unregister the qemu monitor after sending QEMU_PROCESS_EVENT_MONITOR_EOF to workerPool:
>>>
>>> static void
>>> qemuProcessHandleMonitorEOF(qemuMonitorPtr mon,
>>>                             virDomainObjPtr vm,
>>>                             void *opaque) {
>>>     virQEMUDriverPtr driver = opaque;
>>>     qemuDomainObjPrivatePtr priv;
>>>     struct qemuProcessEvent *processEvent;
>>>     ...
>>>     processEvent->eventType = QEMU_PROCESS_EVENT_MONITOR_EOF;
>>>     processEvent->vm = vm;
>>>
>>>     virObjectRef(vm);
>>>     if (virThreadPoolSendJob(driver->workerPool, 0, processEvent) < 0) {
>>>         ignore_value(virObjectUnref(vm));
>>>         VIR_FREE(processEvent);
>>>         goto cleanup;
>>>     }
>>>
>>>     /* We don't want this EOF handler to be called over and over while the
>>>      * thread is waiting for a job.
>>>      */
>>>     qemuMonitorUnregister(mon);
>>>     ...
>>> }
>>>
>>> Then we handle QEMU_PROCESS_EVENT_MONITOR_EOF in the processMonitorEOFEvent function:
>>>
>>> static void
>>> processMonitorEOFEvent(virQEMUDriverPtr driver,
>>>                        virDomainObjPtr vm) {
>>>     ...
>>>     if (qemuProcessBeginStopJob(driver, vm, QEMU_JOB_DESTROY, true) < 0)
>>>         return;
>>>     ...
>>> }
>>>
>>> Here, libvirt will keep reporting the vm state as running if
>>> qemuProcessBeginStopJob returns -1, even though qemu may terminate or be
>>> killed later.
>>>
>>> So, maybe we should re-register the monitor when
>>> qemuProcessBeginStopJob fails?
>>
>> The fact that processMonitorEOFEvent() failed to grab DESTROY job means
>> that we screwed up earlier and now you're just seeing effects of it.
>> Threads should be able to acquire DESTROY job at any point, regardless of
>> other jobs set on the domain object.
>>
>> Can you please:
>> a) try to turn on debug logs [1] and tell us why acquiring DESTROY job
>> failed? You should see an error message like this:
>>
>>   error: cannot acquire state change lock ..
>>
>> b) tell us what is your libvirt version and if you're able to reproduce
>> this with the latest git HEAD?
>>
> 
> By "qemuProcessBeginStopJob failed" I mean that:

Oh, I thought that the message you've sent earlier is related to this:

https://www.redhat.com/archives/libvir-list/2018-March/msg00148.html

So you are not accidentally sending SIGKILL to qemu then?

> we failed to kill the qemu process within 15 seconds (see virProcessKillPainfully).
> IOW, we send SIGTERM and SIGKILL but the qemu process doesn't exit within 15s, and
> libvirt then thinks qemu is still running even though qemu does in fact exit
> after the 15s loop in virProcessKillPainfully has ended.
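
For reference, the escalation described in the quoted text can be sketched
roughly as below. This is a standalone illustration, not libvirt's
virProcessKillPainfully: the name kill_with_timeout, its parameters, and the
200 ms polling interval are all assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <time.h>

#define POLL_MS 200

/* Hypothetical helper (not libvirt code): send SIGTERM, escalate to SIGKILL
 * once term_ms have passed, and give up after total_ms.
 * Returns 0 once the pid no longer exists, -1 if it is still present at the
 * deadline or if signalling fails for a reason other than ESRCH. */
static int
kill_with_timeout(pid_t pid, int term_ms, int total_ms)
{
    const struct timespec poll = { 0, POLL_MS * 1000L * 1000L };
    int signum = SIGTERM;
    int sent_kill = 0;
    int waited_ms;

    assert(pid > 1);
    assert(term_ms >= 0 && total_ms >= term_ms);

    for (waited_ms = 0; ; waited_ms += POLL_MS) {
        if (!sent_kill && waited_ms > 0 && waited_ms >= term_ms) {
            signum = SIGKILL;
            sent_kill = 1;
        }

        /* ESRCH means the process is already gone. */
        if (kill(pid, signum) < 0)
            return errno == ESRCH ? 0 : -1;

        if (waited_ms >= total_ms)
            break;

        /* From here on only probe for existence with signal 0. Note that
         * an unreaped child (zombie) still counts as existing. */
        signum = 0;
        nanosleep(&poll, NULL);
    }

    return -1;
}
```

A -1 from such a helper only says that the process was still present when
the deadline passed. It says nothing about whether the process exits a
moment later, which is the case being discussed here.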

What state is qemu process in then? I mean, how can we see EOF if the
process still exists?


Michal