[libvirt] [PATCH] qemu: fix state change lock held by remoteDispatchDomainBlockJobAbort forever
Jiri Denemark
jdenemar at redhat.com
Thu Sep 1 10:20:08 UTC 2016
On Thu, Sep 01, 2016 at 14:46:01 +0800, Xiubo Li wrote:
> When doing the snapshot using the script below:
> =========================================
> #!/bin/bash
> virsh blockjob $instance_name vdX --abort
> virsh undefine $instance_name
> qemu-img create -f qcow2 -o backing_file=$diskfile,size=$size $path/$uuid.dlta
> virsh blockcopy --domain $instance_name vdX $path/$uuid.dlta --wait --reuse-external --verbose
> virsh blockjob $instance_name vdX --abort
> qemu-img convert -f qcow2 -O qcow2 $path/$uuid.dlta $path/$uuid.qcow2
> echo "start uploading at $(date)"
> ......
> =========================================
>
> Sometimes you can encounter the following warning and error logs:
> +++++++++++++++++++++++++++++++++++++++++
> 2016-08-26 07:51:05.685+0000: 8916: warning : qemuDomainObjBeginJobInternal:1571 :
> Cannot start job (query, none) for domain instance-00001b3a; current job is (modify,
> none) owned by (8914 remoteDispatchDomainBlockJobAbort, 0 <null>) for (30s, 0s)
>
> 2016-08-26 07:51:05.685+0000: 8916: error : qemuDomainObjBeginJobInternal:1583 :
> Timed out during operation: cannot acquire state change lock (held by
> remoteDispatchDomainBlockJobAbort)
> -----------------------------------------
>
> Mostly it will recover later. But sometimes the state change lock is held by
> remoteDispatchDomainBlockJobAbort forever, and then the instance can no longer be reached
> through VNC or the network (ping, ssh, ...) except after rebooting the instance.
>
> From the code, we find that after sending the --abort command there are two
> steps, each blocking in a condition wait until its reply signal arrives:
> +++++++++++++++++++++++++++++++++++++++++
> The first condition wait is:
> In qemuMonitorBlockJobCancel() --> qemuMonitorJSONBlockJobCancel() -->
> qemuMonitorJSONCommand() --> qemuMonitorJSONCommandWithFd() -->
> qemuMonitorSend() --> virCondWait(&mon->notify, &mon->parent.lock).
>
> The second condition wait is:
> In virDomainCondWait(vm) --> virCondWait(&vm->cond, &vm->parent.lock).
> -----------------------------------------
>
> The two corresponding replies are:
> +++++++++++++++++++++++++++++++++++++++++
> The "return" reply is:
> QEMU_MONITOR_RECV_REPLY: mon=0x7ff3fc001020 reply={"return": [], "id": "libvirt-338"}
> With the condition wakeup signal:
> virCondBroadcast(&mon->notify)
>
> The "event" reply is:
> QEMU_MONITOR_RECV_EVENT: mon=0x7fe08401a3f0 event={"timestamp": {"seconds": 1472524518,
> "microseconds": 360135}, "event": "BLOCK_JOB_CANCELLED", "data": {"device":
> "drive-virtio-disk0", "len": 42949672960, "offset": 10485760, "speed": 0, "type": "mirror"}}
> With the condition wakeup signal:
> virDomainObjBroadcast(vm) --> virCondBroadcast(&vm->cond)
> -----------------------------------------
>
> But sometimes the qemu daemon will reply like:
> +++++++++++++++++++++++++++++++++++++++++
> The "event" reply is:
> QEMU_MONITOR_RECV_EVENT: mon=0x7fe08401a3f0 event={"timestamp": {"seconds": 1472524518,
> "microseconds": 360135}, "event": "BLOCK_JOB_CANCELLED", "data": {"device":
> "drive-virtio-disk0", "len": 42949672960, "offset": 10485760, "speed": 0, "type": "mirror"}}
> With the condition wakeup signal:
> virDomainObjBroadcast(vm) --> virCondBroadcast(&vm->cond)
>
> The "return" reply is:
> QEMU_MONITOR_RECV_REPLY: mon=0x7ff3fc001020 reply={"return": [], "id": "libvirt-338"}
> With the condition wakeup signal:
> virCondBroadcast(&mon->notify)
> -----------------------------------------
>
> In this case, while blocked in the first condition wait, the received "event" reply and its
> signal are lost because we are still waiting for the "return" reply and signal. The second
> condition wait then blocks forever while holding the lock.
Nice analysis.
...
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index f9a3b15..6ebaf6b 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -15929,6 +15929,7 @@ qemuDomainBlockJobAbort(virDomainPtr dom,
> bool pivot = !!(flags & VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT);
> bool async = !!(flags & VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC);
> virDomainObjPtr vm;
> + unsigned long long now;
> int ret = -1;
>
> virCheckFlags(VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC |
> @@ -16018,7 +16019,12 @@ qemuDomainBlockJobAbort(virDomainPtr dom,
> qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk);
> qemuBlockJobUpdate(driver, vm, disk);
> while (diskPriv->blockjob) {
> - if (virDomainObjWait(vm) < 0) {
> + if (virTimeMillisNow(&now) < 0) {
> + ret = -1;
> + goto endjob;
> + }
> +
> + if (virDomainObjWaitUntil(vm, now + BLOCK_JOB_ABORT_TIMEOUT) < 0) {
> ret = -1;
> goto endjob;
> }
However, the patch is wrong. We should simply never attempt to wait for
the event if we already got it. In other words, diskPriv->blockjob
should already be false at this point.
Jirka