[libvirt] RFC: Exposing "ready" bool (of `query-block-jobs`) or QMP BLOCK_JOB_READY event

John Snow jsnow at redhat.com
Thu Oct 6 16:45:30 UTC 2016



On 10/06/2016 10:25 AM, Eric Blake wrote:
> On 10/06/2016 06:34 AM, Peter Krempa wrote:
>
>>> Currently libvirt block APIs (& consequently higher-level applications
>>> like Nova which use these APIs) rely on polling for job completion via
>>
>> libvirt is _not_ polling the data. Libvirt relies on the events from
>> qemu which are also exposed as libvirt events.
>
> Libvirt is not the one deciding when to issue the pivot command, Nova
> is.  Right now, Nova is polling (rather than waiting for events), and
> its polling is solely conditional on cur==end rather than on the XML
> addition of ready='true'.
>
>>
>> We expose the state of the copy job in the XML and forward the READY
>> event from qemu to the users.
>
> I was not aware of that when I was chatting on IRC yesterday; that's
> useful to know, because virDomainGetBlockJobInfo() is NOT exposing that
> information at the moment.
>
>> The documentation suggests that block jobs should listen to the events
>> and act accordingly only after receiving the event.
>
> Yes, but the documentation ALSO states that waiting for cur==end is
> SUPPOSED to work.  And it doesn't.
>
>>> ~~~~~~~~~~~~~~~~~~~~~
>>>
>>> libvirt finds cur==end AND sends a pivot request, all in the window
>>> before QEMU would have sent "ready": true field [emitted as part of the
>>
>> This is not true. Libvirt checks that the mirror is actually ready. It's
>> done by the commit you've mentioned above.
>
> In other words, Nova sees cur==end, and requests the pivot, but libvirt
> is rejecting Nova's request because 'ready' is not true yet; and Nova
> then gives up rather than trying again.
>
>>
>>> QMP `query-block-jobs` command's response, indicating that the job has
>>> actually completed], however the pivot request fails because it requires
>>> "ready": true.
>>
>> The problem is that you are polling the block job info which correctly
>> reports that all data was properly copied and you are inferring the
>> block job state from that data.
>
> But the problem here is that qemu is NOT accurately reporting data - it
> is reporting cur==end with the promise that they are only equal if the
> job is stable, WHEN THE JOB IS NOT STABLE.
>

Do we really promise that in QEMU? I guess since jobs have existed since
before the ready field was added, we do...

>>
>> I'm against deliberately reporting false data in the block info
>> structure.
>
> We are NOT falsifying any information, any more than we are falsifying
> information by changing cur/end to 0/1 when ready:false and qemu
> reported 0/0.  (see commit 988218ca).
>
>>
>> The application should register handlers for the block job events and
>> act only if it receives such event. Additionally you may want to check
>> that the state is correct in the XML. The current block job information
>> structure can't be extended unfortunately.
>
> Yes, changing Nova to use event handlers is a good idea.  But I'm ALSO
> in favor of fixing libvirt to work around the qemu bug, by intentionally
> munging the output to state cur<end (even if qemu reported cur==end) if
> qemu reports ready:false.
>
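
To make the munging proposal concrete, here is a rough sketch of the
adjustment being discussed, not libvirt's actual code. The names
munge_progress and BlockJobProgress are made up for illustration, and
reporting end - 1 for the cur==end case is only one assumed way of
making cur<end; the thread does not specify which value to use. The
0/0 -> 0/1 case mirrors the behaviour attributed to commit 988218ca
above.

```python
from typing import NamedTuple


class BlockJobProgress(NamedTuple):
    """Progress as it would be reported to the caller (hypothetical type)."""
    cur: int
    end: int


def munge_progress(cur: int, end: int, ready: bool) -> BlockJobProgress:
    """Adjust qemu's cur/end so that cur == end only when ready is true.

    Assumption: callers such as Nova treat cur == end as "safe to pivot",
    so we never report equality while qemu says ready:false.
    """
    if ready or cur != end:
        # Either the job really is ready, or the values already show
        # work outstanding; pass them through untouched.
        return BlockJobProgress(cur, end)
    if end == 0:
        # qemu reported 0/0 before the job has started: report 0/1.
        return BlockJobProgress(0, 1)
    # qemu reported cur == end but the job is not ready yet: hold the
    # reported position just short of the end (assumed choice).
    return BlockJobProgress(end - 1, end)
```

With this in place a poller that waits for cur==end would keep waiting
until ready:true is reported, e.g. munge_progress(100, 100, False)
yields 99/100 while munge_progress(100, 100, True) yields 100/100.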



