[libvirt] RFC: Exposing "ready" bool (of `query-block-jobs`) or QMP BLOCK_JOB_READY event

Kashyap Chamarthy kchamart at redhat.com
Thu Oct 6 10:26:50 UTC 2016


Background
----------

For QEMU block device jobs, the "ready" boolean field (part of QMP
`query-block-jobs`) was introduced in commit ef6dbf1 (available in QEMU
v2.2.0 or above):

    http://git.qemu.org/?p=qemu.git;a=commitdiff;h=ef6dbf1e4 --
    blockjob: Add "ready" field

    "When a block job signals readiness, this is currently reported only
    through QMP. If qemu wants to use block jobs for internal tasks,
    there needs to be another way to correctly detect when a block job
    may be completed.
    
    For this reason, introduce a bool "ready" which is set when the
    block job may be completed."


And, libvirt was fixed to use the above field in this commit (available
in libvirt v1.2.18 or above):

    http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=eae5924 -- qemu:
    Update state of block job to READY only if it actually is ready


RFC
---

Currently libvirt block APIs (& consequently higher-level applications
like Nova which use these APIs) rely on polling for job completion via
virDomainGetBlockJobInfo(), which uses QMP `query-block-jobs`, and
waits for QEMU to report "offset" == "len", which translates to libvirt
"cur" == "end".  Based on this, libvirt can take an action (whether to
gracefully abort, or pivot to the copy in case of a COPY job).

Since QEMU reports the "ready": true field (followed by a
BLOCK_JOB_READY QMP event), it would be helpful if libvirt exposed this
via an API, so upper layers could use that instead of polling.
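
As a rough illustration of the event-driven alternative, the sketch
below scans a sequence of QMP messages (one JSON document per line) for
the first BLOCK_JOB_READY event.  The function name and the sample
device name are made up for the example; how such an event would be
surfaced through a libvirt API is precisely the open question of this
RFC.

```python
import json


def first_ready_device(qmp_lines):
    """Return the device named in the first BLOCK_JOB_READY event.

    qmp_lines is an iterable of JSON strings as they would arrive on a
    QMP monitor.  Returns None if no such event is seen.
    """
    for line in qmp_lines:
        msg = json.loads(line)
        if msg.get("event") == "BLOCK_JOB_READY":
            return msg.get("data", {}).get("device")
    return None


# Illustrative input only; the device name and sizes are invented.
SAMPLE = [
    '{"return": {}}',
    '{"event": "BLOCK_JOB_READY", "data": {"device": "drive0", '
    '"type": "mirror", "len": 2097152, "offset": 2097152, "speed": 0}}',
]
```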


Problem scenario
----------------

When virDomainBlockRebase() is invoked to start a copy job, aborting
that copy operation with virDomainBlockJobAbort() and the flag
VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT can hit a race condition (due to the
way virDomainGetBlockJobInfo() reports the job status) in which the
pivot operation fails.

Race condition window
~~~~~~~~~~~~~~~~~~~~~

libvirt finds cur == end and sends a pivot request, all within the
window before QEMU has set "ready": true [a field in the response to the
QMP `query-block-jobs` command, indicating that the job may be
completed].  The pivot request then fails, because it requires
"ready": true.
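
For illustration, this is roughly what a `query-block-jobs` response
looks like inside that window, together with a small check for it.  The
response is hand-written for the example (device name and sizes are
invented), and `in_race_window` is a hypothetical helper, not an
existing libvirt or QEMU function.

```python
import json

# Hand-written example: offset == len, yet the job is not ready.
RESPONSE = """
{"return": [{"type": "mirror", "device": "drive-virtio-disk0",
             "len": 1073741824, "offset": 1073741824,
             "busy": false, "paused": false, "speed": 0,
             "ready": false}]}
"""


def in_race_window(job):
    """True if the job looks finished by offset/len but is not ready.

    A pivot attempted in this state is the one that fails.  A job with
    no "ready" key at all (older QEMU) is not flagged here.
    """
    return job["offset"] == job["len"] and job.get("ready") is False
```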

So Eric Blake suggests:

    QEMU 2.0 or 1.x probably had a synchronous setup where you could
    never observe cur==end on a non-ready job. But I don't remember
    enough history to point to when QEMU switched jobs to be a bit more
    asynchronous.  Maybe there was no qemu regression - maybe it was
    BECAUSE of other block-job additions in 2.2 that offset==len was no
    longer reliable.  I don't know that for sure.

    But what it DOES sound like is that IF qemu reports "ready": false,
    offset==len is not reliable, and libvirt should be taught to fudge
    that.

    And hopefully, QEMU too old to report "ready" at all is reliable
    with regards to offset==len, because that's all we have to go by.
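
One possible reading of the "fudge" suggested above, as a sketch only:
when QEMU says "ready": false, never let the reported cur reach end;
when the field is absent, pass offset and len through unchanged.  The
function name is hypothetical, and holding cur at end - 1 is just one
way to do it, not necessarily what libvirt would implement.

```python
def report_cur_end(job):
    """Map a query-block-jobs entry to a (cur, end) pair for callers.

    job is a dict with integer "offset" and "len" keys and an optional
    boolean "ready" key.
    """
    cur, end = job["offset"], job["len"]
    # Only adjust when QEMU explicitly reports the job as not ready.
    # The end > 0 guard avoids reporting a negative cur.
    if job.get("ready") is False and cur == end and end > 0:
        cur = end - 1
    return cur, end
```

With this, a caller polling for cur == end would keep waiting until
QEMU actually flips "ready" to true.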


For now, I filed this upstream libvirt bug:

    https://bugzilla.redhat.com/show_bug.cgi?id=1382165 --
    virDomainGetBlockJobInfo: Adjust job reporting based on QEMU stats &
    the "ready" field of `query-block-jobs`

However, exposing the "ready" boolean from QMP `query-block-jobs` might
be worth considering.

-- 
/kashyap



