[libvirt] RFC: Exposing "ready" bool (of `query-block-jobs`) or QMP BLOCK_JOB_READY event

Peter Krempa pkrempa at redhat.com
Thu Oct 6 11:34:59 UTC 2016


On Thu, Oct 06, 2016 at 12:26:50 +0200, Kashyap Chamarthy wrote:
> Backround
> ---------
> 
> For QEMU block device jobs, the "ready" boolean field (part of QMP
> `query-block-jobs`) was introduced in commit ef6dbf1 (available in QEMU
> v2.2.0 or above):
> 
>     http://git.qemu.org/?p=qemu.git;a=commitdiff;h=ef6dbf1e4 --
>     blockjob: Add "ready" field
> 
>     "When a block job signals readiness, this is currently reported only
>     through QMP. If qemu wants to use block jobs for internal tasks,
>     there needs to be another way to correctly detect when a block job
>     may be completed.
>     
>     For this reason, introduce a bool "ready" which is set when the
>     block job may be completed."
> 
> 
> And, libvirt was fixed to use the above field in this commit (available
> in libvirt v1.2.18 or above):
> 
>     http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=eae5924 -- qemu:
>     Update state of block job to READY only if it actually is ready [1]
> 
> 
> RFC
> ---
> 
> Currently libvirt block APIs (& consequently higher-level applications
> like Nova which use these APIs) rely on polling for job completion via

libvirt is _not_ polling the data. Libvirt relies on the events from
qemu which are also exposed as libvirt events.

> virDomainGetBlockJobInfo(), which uses QMP `query-block-jobs`, and
> waits for QEMU to report "offset" == "len", which translates to libvirt
> "cur" == "end".  Based on this, libvirt can take an action (whether to
> gracefully abort, or pivot to the copy in case of a COPY job).

Again false. Internally we honor the ready state and applications like
virsh have code in place that waits for the async events and act after
receiving them.

> Since QEMU reports the "ready": true field (followed by a
> BLOCK_JOB_READY QMP event).  It would be helpful if libvirt expose this
> via an API, so upper layers could instead use that, rather than polling.

We expose the state of the copy job in the XML and forward the READY
event from qemu to the users.

A running copy job exposes itself in the xml as:

    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/var/lib/libvirt/images/systemrescuecd-x86-4.8.0.iso'/>
      <backingStore/>
      <mirror type='file' file='/tmp/ble.img' format='raw' job='copy'>
        <format type='raw'/>
        <source file='/tmp/ble.img'/>
      </mirror>
      [...]
    </disk>

While the ready copy job is exposed as:

    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/var/lib/libvirt/images/systemrescuecd-x86-4.8.0.iso'/>
      <backingStore/>
      <mirror type='file' file='/tmp/ble.img' format='raw' job='copy' ready='yes'>
        <format type='raw'/>
        <source file='/tmp/ble.img'/>
      </mirror>
      [...]
    </disk>

Additionally we have anyncrhronous events that are emitted once qemu
notifies us that the block job has reached sync state or finished.
Libvirt uses the event to switch to the ready state.

The documentation suggests that block jobs should listen to the events
and act accordingly only after receiving the event.


> Problem scenario
> ----------------
> 
> When virDomainBlockRebase() is invoked to start a copy job, then
> aborting the said copy operation with virDomainBlockJobAbort() + flag
> VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT can result in a potential race
> condition (due to the way the virDomainGetBlockJobInfo() reports the job
> status) where the pivot operation fails.

Again, this should trigger the block job event and that should be used
as the marker to perform the pivot operation.

$ virsh event --all --loop
event 'block-job-2' for domain hp: Block Copy for hda ready  <- after the sync phase is reached
event 'block-job-2' for domain hp: Block Copy for hda completed <- after successfully pivoting

Applications should act only after receiving such event.


> Race condition window
> ~~~~~~~~~~~~~~~~~~~~~
> 
> libvirt finds cur==end AND sends a pivot request, all in the window
> before QEMU would have sent "ready": true field [emitted as part of the

This is not true. Libvirt checks that the mirror is actually ready. It's
done by the commit you've mentioned above.

> QMP `query-block-jobs` command's response, indicating that the job has
> actually completed], however the pivot request fails because it requires
> "ready": true.

The problem is that you are polling the block job info which correctly
reports that all data was properly copied and you are inferring the
block job state from that data.

I'm against deliberately reporting false data in the block info
structure.

The application should register handlers for the block job events and
act only if it receives such event. Additionally you may want to check
that the state is correct in the XML. The current block job information
structure can't be extended unfortunately.

Please look at the virsh handling of the block jobs for inspiration
since we have example code doing the right thing here.

Peter
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: Digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20161006/61d034fc/attachment-0001.sig>


More information about the libvir-list mailing list