[libvirt] [Qemu-devel] virDomainBlockJobAbort and block_job_cancel

Adam Litke agl at us.ibm.com
Thu Dec 8 14:55:56 UTC 2011

On Wed, Dec 07, 2011 at 04:01:58PM -0700, Eric Blake wrote:
> On 12/07/2011 03:35 PM, Adam Litke wrote:
> > Stefan's qemu tree has a block_job_cancel command that always acts
> > asynchronously.  In order to provide the synchronous behavior in libvirt (when
> > flags is 0), I need to wait for the block job to go away.  I see two options:
> > 
> > 1) Use the event:
> > To implement this I would create an internal event callback function and
> > register it to receive the block job events.  When the cancel event comes in, I
> > can exit the qemu job context.  One problem I see with this is that events are
> > only available when using the json monitor mode.
> I like this idea.  We have internally handled events before, and limited
> it to just JSON if that made life easier: for example, virDomainReboot
> on qemu is rejected if you only have the HMP monitor, and if you have
> the JSON monitor, the implementation internally handles the event to
> change the domain state.
> Can we reliably detect whether qemu is new enough to provide the event,
> and if qemu was older and did not provide the event, do we reliably know
> that abort was blocking in that case?

I think we can say that qemu will operate in one of two modes:
a) Blocking abort AND event is not emitted
b) Non-blocking abort AND event is emitted

The difficulty is in detecting which case the current qemu supports.  I don't
believe there is a way to query qemu for a list of currently-supported events.
Therefore, we'd have to use version numbers.  If we do this, how do we avoid
breaking users of qemu git versions that fall between official qemu releases?

> It's okay to make things work that used to fail, but it is a regression
> to make blocking job cancel fail where it used to work, so rejecting
> blocking job cancel with HMP monitor is not a good idea.  If we can
> guarantee that all qemu new enough to have async cancel also support the
> event, while older qemu was always blocking, and that we can always use
> the JSON monitor to newer qemu, then we're set - merely ensure that we
> use only the JSON monitor and the event.  But if we can't make the
> guarantees, and insist on supporting newer qemu via HMP monitor, then we
> may need a hybrid combination of options 1 and 2, or for less code
> maintenance, just option 2.

Is there a deprecation plan for HMP with newer qemu versions?  I really hate the
idea of needing two implementations for this: one polling and one event-based.

> > 2) Poll the qemu monitor
> > To do it this way, I would write a function that repeatedly calls
> > virDomainGetBlockJobInfo() against the disk in question.  Once the job
> > disappears from the list I can return with confidence that the job is gone.
> > This is obviously sub-optimal because I need to poll and sleep.
> We've done this before, for both HMP and JSON - see
> qemuMigrationWaitForCompletion.  I agree that an event is nicer than
> polling, but we may be locked into this.
> > 
> > 3) Ask Stefan to provide a synchronous mode for the qemu monitor command
> > This one is the nicest from a libvirt perspective, but I doubt qemu wants to add
> > such an interface given that it basically has broken semantics as far as qemu is
> > concerned.
> Or even:
> 4) Ask Stefan to make the HMP monitor command synchronous, but only
> expose the JSON command as asynchronous.  After all, it is only HMP
> where we can't wait for an event.

Stefan, how 'bout it?

> > 
> > If this is all too nasty, we could probably just change the behavior of
> > blockJobAbort and make it always synchronous with a 'cancelled' event.
> No, we can't change the behavior without breaking back-compat of
> existing clients of job cancellation.

Adam Litke <agl at us.ibm.com>
IBM Linux Technology Center

More information about the libvir-list mailing list