[libvirt] [PATCHv3 01/16] blockjob: add API for async virDomainBlockJobAbort

Laine Stump laine at laine.org
Mon Apr 9 20:45:08 UTC 2012


On 04/06/2012 02:29 PM, Eric Blake wrote:
> From: Adam Litke <agl at us.ibm.com>
>
> Qemu has changed the semantics of the "block_job_cancel" API.  The original
> qed implementation (pretty much only backported to RHEL 6.2 qemu) was
> synchronous (ie. upon command completion, the operation was guaranteed to
> be completely stopped).  With the new semantics going into qemu 1.1 for
> qcow2, a "block_job_cancel" merely requests that the operation be cancelled
> and an event is triggered once the cancellation request has been honored.
>
> To adopt the new semantics while preserving compatibility, the following
> updates are made to the virDomainBlockJob API:
>
> A new block job event type VIR_DOMAIN_BLOCK_JOB_CANCELLED is recognized by
> libvirt.  Regardless of the flags used with virDomainBlockJobAbort, this event
> will be raised whenever it is received from qemu.  This event indicates that a
> block job has been successfully cancelled.  For now, libvirt does not try
> to synthesize this event if using an older qemu that did not generate it.
>
> A new extension flag VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC is added to the
> virDomainBlockJobAbort API.  When enabled, this function will operate
> asynchronously (ie, it can return before the job has actually been cancelled).
> When the API is used in this mode, it is the responsibility of the caller to
> wait for a VIR_DOMAIN_BLOCK_JOB_CANCELLED event or poll via the
> virDomainGetBlockJobInfo API to check the cancellation status; requesting
> this flag is an error if it is not known whether the hypervisor supports
> asynchronous cancel.
>
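
For anyone following the API change from the sidelines, here is roughly how
I'd expect a management client to drive the new flag -- just a sketch, with
the helper name, poll interval, and disk shorthand made up for illustration:

    #include <unistd.h>
    #include <libvirt/libvirt.h>

    /* Request cancellation without blocking, then poll until the job is
     * gone.  A real client would bound the loop, or listen for the block
     * job event instead of spinning. */
    static int
    cancel_and_wait(virDomainPtr dom, const char *disk)
    {
        virDomainBlockJobInfo info;
        int r;

        if (virDomainBlockJobAbort(dom, disk,
                                   VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC) < 0)
            return -1;

        /* virDomainGetBlockJobInfo() returns 1 while a job is still
         * active, 0 once no job remains, and -1 on error. */
        while ((r = virDomainGetBlockJobInfo(dom, disk, &info, 0)) == 1)
            usleep(100 * 1000);

        return r;   /* 0 == job gone (cancelled or finished), -1 == error */
    }
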
> This patch also exposes the new flag through virsh.
>
> Signed-off-by: Adam Litke <agl at us.ibm.com>
> Cc: Stefan Hajnoczi <stefanha at gmail.com>
> Signed-off-by: Eric Blake <eblake at redhat.com>
> ---
>  include/libvirt/libvirt.h.in |   10 +++++++++
>  src/libvirt.c                |   10 ++++++++-
>  src/qemu/qemu_monitor_json.c |   42 +++++++++++++++++++++++++++++++------
>  tools/virsh.c                |   46 ++++++++++++++++++++++++++---------------
>  tools/virsh.pod              |    9 +++++--
>  5 files changed, 89 insertions(+), 28 deletions(-)
>
> diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in
> index 499dcd4..97ad99d 100644
> --- a/include/libvirt/libvirt.h.in
> +++ b/include/libvirt/libvirt.h.in
> @@ -1946,6 +1946,15 @@ typedef enum {
>  #endif
>  } virDomainBlockJobType;
>
> +/**
> + * virDomainBlockJobAbortFlags:
> + *
> + * VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC: Request only, do not wait for completion
> + */
> +typedef enum {
> +    VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC = 1 << 0,
> +} virDomainBlockJobAbortFlags;
> +
>  /* An iterator for monitoring block job operations */
>  typedef unsigned long long virDomainBlockJobCursor;
>
> @@ -3617,6 +3626,7 @@ typedef void (*virConnectDomainEventGraphicsCallback)(virConnectPtr conn,
>  typedef enum {
>      VIR_DOMAIN_BLOCK_JOB_COMPLETED = 0,
>      VIR_DOMAIN_BLOCK_JOB_FAILED = 1,
> +    VIR_DOMAIN_BLOCK_JOB_CANCELED = 2,


You know, this is one case where I think the Queen's English has it
right - something just seems *wrong* about spelling CANCELED with one "L"...


>
>  #ifdef VIR_ENUM_SENTINELS
>      VIR_DOMAIN_BLOCK_JOB_LAST
> diff --git a/src/libvirt.c b/src/libvirt.c
> index 16d1fd5..af22232 100644
> --- a/src/libvirt.c
> +++ b/src/libvirt.c
> @@ -17902,7 +17902,7 @@ error:
>   * virDomainBlockJobAbort:
>   * @dom: pointer to domain object
>   * @disk: path to the block device, or device shorthand
> - * @flags: extra flags; not used yet, so callers should always pass 0
> + * @flags: bitwise-OR of virDomainBlockJobAbortFlags
>   *
>   * Cancel the active block job on the given disk.
>   *
> @@ -17913,6 +17913,14 @@ error:
>   * can be found by calling virDomainGetXMLDesc() and inspecting
>   * elements within //domain/devices/disk.
>   *
> + * By default, this function performs a synchronous operation and the caller
> + * may assume that the operation has completed when 0 is returned.  However,
> + * BlockJob operations may take a long time to complete, and during this time
> + * further domain interactions may be unresponsive.  To avoid this problem,
> + * pass VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC in the @flags argument to enable
> + * asynchronous behavior.  Either way, when the job has been cancelled, a
> + * BlockJob event will be emitted, with status VIR_DOMAIN_BLOCK_JOB_CANCELLED.
> + *
>   * Returns -1 in case of failure, 0 when successful.
>   */
>  int virDomainBlockJobAbort(virDomainPtr dom, const char *disk,
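
For completeness, the event-driven flavor described above would look
something like this -- again only a sketch (it assumes the client already
runs a libvirt event loop via virEventRegisterDefaultImpl() and
virEventRunDefaultImpl(), and the callback/helper names are invented):

    #include <stdio.h>
    #include <libvirt/libvirt.h>

    /* Report when a block job reaches the new CANCELED status. */
    static void
    block_job_cb(virConnectPtr conn, virDomainPtr dom, const char *disk,
                 int type, int status, void *opaque)
    {
        if (status == VIR_DOMAIN_BLOCK_JOB_CANCELED)
            printf("block job on %s was cancelled\n", disk);
    }

    /* Register for block job events, then issue the non-blocking abort. */
    static int
    abort_async(virConnectPtr conn, virDomainPtr dom, const char *disk)
    {
        if (virConnectDomainEventRegisterAny(conn, dom,
                VIR_DOMAIN_EVENT_ID_BLOCK_JOB,
                VIR_DOMAIN_EVENT_CALLBACK(block_job_cb),
                NULL, NULL) < 0)
            return -1;
        return virDomainBlockJobAbort(dom, disk,
                                      VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC);
    }
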
> diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c
> index d09d779..c9f0f0c 100644
> --- a/src/qemu/qemu_monitor_json.c
> +++ b/src/qemu/qemu_monitor_json.c
> @@ -58,13 +58,14 @@ static void qemuMonitorJSONHandleIOError(qemuMonitorPtr mon, virJSONValuePtr dat
>  static void qemuMonitorJSONHandleVNCConnect(qemuMonitorPtr mon, virJSONValuePtr data);
>  static void qemuMonitorJSONHandleVNCInitialize(qemuMonitorPtr mon, virJSONValuePtr data);
>  static void qemuMonitorJSONHandleVNCDisconnect(qemuMonitorPtr mon, virJSONValuePtr data);
> -static void qemuMonitorJSONHandleBlockJob(qemuMonitorPtr mon, virJSONValuePtr data);
>  static void qemuMonitorJSONHandleSPICEConnect(qemuMonitorPtr mon, virJSONValuePtr data);
>  static void qemuMonitorJSONHandleSPICEInitialize(qemuMonitorPtr mon, virJSONValuePtr data);
>  static void qemuMonitorJSONHandleSPICEDisconnect(qemuMonitorPtr mon, virJSONValuePtr data);
>  static void qemuMonitorJSONHandleTrayChange(qemuMonitorPtr mon, virJSONValuePtr data);
>  static void qemuMonitorJSONHandlePMWakeup(qemuMonitorPtr mon, virJSONValuePtr data);
>  static void qemuMonitorJSONHandlePMSuspend(qemuMonitorPtr mon, virJSONValuePtr data);
> +static void qemuMonitorJSONHandleBlockJobCompleted(qemuMonitorPtr mon, virJSONValuePtr data);
> +static void qemuMonitorJSONHandleBlockJobCanceled(qemuMonitorPtr mon, virJSONValuePtr data);
>
>  static struct {
>      const char *type;
> @@ -80,13 +81,14 @@ static struct {
>      { "VNC_CONNECTED", qemuMonitorJSONHandleVNCConnect, },
>      { "VNC_INITIALIZED", qemuMonitorJSONHandleVNCInitialize, },
>      { "VNC_DISCONNECTED", qemuMonitorJSONHandleVNCDisconnect, },
> -    { "BLOCK_JOB_COMPLETED", qemuMonitorJSONHandleBlockJob, },
>      { "SPICE_CONNECTED", qemuMonitorJSONHandleSPICEConnect, },
>      { "SPICE_INITIALIZED", qemuMonitorJSONHandleSPICEInitialize, },
>      { "SPICE_DISCONNECTED", qemuMonitorJSONHandleSPICEDisconnect, },
>      { "DEVICE_TRAY_MOVED", qemuMonitorJSONHandleTrayChange, },
>      { "WAKEUP", qemuMonitorJSONHandlePMWakeup, },
>      { "SUSPEND", qemuMonitorJSONHandlePMSuspend, },
> +    { "BLOCK_JOB_COMPLETED", qemuMonitorJSONHandleBlockJobCompleted, },
> +    { "BLOCK_JOB_CANCELLED", qemuMonitorJSONHandleBlockJobCanceled, },
>  };
>
>
> @@ -754,13 +756,15 @@ static void qemuMonitorJSONHandleSPICEDisconnect(qemuMonitorPtr mon, virJSONValu
>      qemuMonitorJSONHandleGraphics(mon, data, VIR_DOMAIN_EVENT_GRAPHICS_DISCONNECT);
>  }
>
> -static void qemuMonitorJSONHandleBlockJob(qemuMonitorPtr mon, virJSONValuePtr data)
> +static void
> +qemuMonitorJSONHandleBlockJobImpl(qemuMonitorPtr mon,
> +                                  virJSONValuePtr data,
> +                                  int event)
>  {
>      const char *device;
>      const char *type_str;
>      int type = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN;
>      unsigned long long offset, len;
> -    int status = VIR_DOMAIN_BLOCK_JOB_FAILED;
>
>      if ((device = virJSONValueObjectGetString(data, "device")) == NULL) {
>          VIR_WARN("missing device in block job event");
> @@ -785,11 +789,19 @@ static void qemuMonitorJSONHandleBlockJob(qemuMonitorPtr mon, virJSONValuePtr da
>      if (STREQ(type_str, "stream"))
>          type = VIR_DOMAIN_BLOCK_JOB_TYPE_PULL;
>
> -    if (offset != 0 && offset == len)
> -        status = VIR_DOMAIN_BLOCK_JOB_COMPLETED;
> +    switch (event) {
> +        case VIR_DOMAIN_BLOCK_JOB_COMPLETED:
> +            /* Make sure the whole device has been processed */
> +            if (offset != len)
> +                event = VIR_DOMAIN_BLOCK_JOB_FAILED;
> +            break;


Okay, so if you're talking to an "old" qemu, it will return a COMPLETED
event with offset < length to indicate failure, which is handled here...


> +        case VIR_DOMAIN_BLOCK_JOB_FAILED:


and if it's a new qemu, it will return FAILED explicitly, which is
handled here. What happens when it's an old libvirt and new qemu? (not
that there's anything libvirt could do about it, I'm just curious)


> +        case VIR_DOMAIN_BLOCK_JOB_CANCELED:
> +            break;
> +    }
>
>  out:
> -    qemuMonitorEmitBlockJob(mon, device, type, status);
> +    qemuMonitorEmitBlockJob(mon, device, type, event);
>  }
>
>  static void
> @@ -832,6 +844,22 @@ qemuMonitorJSONHandlePMSuspend(qemuMonitorPtr mon,
>      qemuMonitorEmitPMSuspend(mon);
>  }
>
> +static void
> +qemuMonitorJSONHandleBlockJobCompleted(qemuMonitorPtr mon,
> +                                       virJSONValuePtr data)
> +{
> +    qemuMonitorJSONHandleBlockJobImpl(mon, data,
> +                                      VIR_DOMAIN_BLOCK_JOB_COMPLETED);
> +}
> +
> +static void
> +qemuMonitorJSONHandleBlockJobCanceled(qemuMonitorPtr mon,
> +                                       virJSONValuePtr data)
> +{
> +    qemuMonitorJSONHandleBlockJobImpl(mon, data,
> +                                      VIR_DOMAIN_BLOCK_JOB_CANCELED);
> +}
> +
>  int
>  qemuMonitorJSONHumanCommandWithFd(qemuMonitorPtr mon,
>                                    const char *cmd_str,
> diff --git a/tools/virsh.c b/tools/virsh.c
> index cfdd040..7180b83 100644
> --- a/tools/virsh.c
> +++ b/tools/virsh.c
> @@ -7525,6 +7525,7 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd,
>      const char *name, *path;
>      unsigned long bandwidth = 0;
>      int ret = -1;
> +    unsigned int flags = 0;
>
>      if (!vshConnectionUsability(ctl, ctl->conn))
>          goto cleanup;
> @@ -7541,7 +7542,9 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd,
>      }
>
>      if (mode == VSH_CMD_BLOCK_JOB_ABORT) {
> -        ret = virDomainBlockJobAbort(dom, path, 0);
> +        if (vshCommandOptBool(cmd, "async"))
> +            flags |= VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC;
> +        ret = virDomainBlockJobAbort(dom, path, flags);
>      } else if (mode == VSH_CMD_BLOCK_JOB_INFO) {
>          ret = virDomainGetBlockJobInfo(dom, path, info, 0);
>      } else if (mode == VSH_CMD_BLOCK_JOB_SPEED) {
> @@ -7589,20 +7592,25 @@ cmdBlockPull(vshControl *ctl, const vshCmd *cmd)
>  }
>
>  /*
> - * "blockjobinfo" command
> + * "blockjob" command
>   */
>  static const vshCmdInfo info_block_job[] = {
> -    {"help", N_("Manage active block operations.")},
> -    {"desc", N_("Manage active block operations.")},
> +    {"help", N_("Manage active block operations")},
> +    {"desc", N_("Query, adjust speed, or cancel active block operations.")},
>      {NULL, NULL}
>  };
>
>  static const vshCmdOptDef opts_block_job[] = {
>      {"domain", VSH_OT_DATA, VSH_OFLAG_REQ, N_("domain name, id or uuid")},
>      {"path", VSH_OT_DATA, VSH_OFLAG_REQ, N_("Fully-qualified path of disk")},
> -    {"abort", VSH_OT_BOOL, VSH_OFLAG_NONE, N_("Abort the active job on the specified disk")},
> -    {"info", VSH_OT_BOOL, VSH_OFLAG_NONE, N_("Get active job information for the specified disk")},
> -    {"bandwidth", VSH_OT_DATA, VSH_OFLAG_NONE, N_("Set the Bandwidth limit in MB/s")},
> +    {"abort", VSH_OT_BOOL, VSH_OFLAG_NONE,
> +     N_("Abort the active job on the specified disk")},
> +    {"async", VSH_OT_BOOL, VSH_OFLAG_NONE,
> +     N_("don't wait for --abort to complete")},
> +    {"info", VSH_OT_BOOL, VSH_OFLAG_NONE,
> +     N_("Get active job information for the specified disk")},
> +    {"bandwidth", VSH_OT_DATA, VSH_OFLAG_NONE,
> +     N_("Set the Bandwidth limit in MB/s")},
>      {NULL, 0, 0, NULL}
>  };
>
> @@ -7613,19 +7621,23 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd)
>      virDomainBlockJobInfo info;
>      const char *type;
>      int ret;
> +    bool abortMode = vshCommandOptBool(cmd, "abort");
> +    bool infoMode = vshCommandOptBool(cmd, "info");
> +    bool bandwidth = vshCommandOptBool(cmd, "bandwidth");
>
> -    if (vshCommandOptBool (cmd, "abort")) {
> -        mode = VSH_CMD_BLOCK_JOB_ABORT;
> -    } else if (vshCommandOptBool (cmd, "info")) {
> -        mode = VSH_CMD_BLOCK_JOB_INFO;
> -    } else if (vshCommandOptBool (cmd, "bandwidth")) {
> -        mode = VSH_CMD_BLOCK_JOB_SPEED;
> -    } else {
> +    if (abortMode + infoMode + bandwidth > 1) {
>          vshError(ctl, "%s",
>                   _("One of --abort, --info, or --bandwidth is required"));


You're changing the test here to require 0 or 1 of the options, rather
than exactly one, effectively making --info the default. That makes
sense, but I think the error message should be a bit different, maybe
"no more than one of --abort, --info, or --bandwidth is allowed (default
is --info)".


>          return false;
>      }
>
> +    if (abortMode)
> +        mode = VSH_CMD_BLOCK_JOB_ABORT;
> +    else if (bandwidth)
> +        mode = VSH_CMD_BLOCK_JOB_SPEED;
> +    else
> +        mode = VSH_CMD_BLOCK_JOB_INFO;
> +
>      ret = blockJobImpl(ctl, cmd, &info, mode);
>      if (ret < 0)
>          return false;
> @@ -7634,13 +7646,13 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd)
>          return true;
>
>      if (info.type == VIR_DOMAIN_BLOCK_JOB_TYPE_PULL)
> -        type = "Block Pull";
> +        type = _("Block Pull");
>      else
> -        type = "Unknown job";
> +        type = _("Unknown job");
>
>      print_job_progress(type, info.end - info.cur, info.end);
>      if (info.bandwidth != 0)
> -        vshPrint(ctl, "    Bandwidth limit: %lu MB/s\n", info.bandwidth);
> +        vshPrint(ctl, _("    Bandwidth limit: %lu MB/s\n"), info.bandwidth);
>      return true;
>  }
>
> diff --git a/tools/virsh.pod b/tools/virsh.pod
> index a60e667..aac72d1 100644
> --- a/tools/virsh.pod
> +++ b/tools/virsh.pod
> @@ -686,13 +686,16 @@ Both I<--live> and I<--current> flags may be given, but I<--current> is
>  exclusive. If no flag is specified, behavior is different depending
>  on hypervisor.
>
> -=item B<blockjob> I<domain> I<path> [I<--abort>] [I<--info>] [I<bandwidth>]
> +=item B<blockjob> I<domain> I<path> { I<--abort> [I<--async>] |
> +[I<--info>] | I<bandwidth> }


A bit awkward to parse, but it does eventually make sense :-)


>
> -Manage active block operations.
> +Manage active block operations.  If no mode is chosen, I<--info> is assumed.
>
>  I<path> specifies fully-qualified path of the disk.
> +
>  If I<--abort> is specified, the active job on the specified disk will
> -be aborted.
> +be aborted.  If I<--async> is also specified, this command will return
> +immediately, rather than waiting for the cancelation to complete.
>  If I<--info> is specified, the active job information on the specified
>  disk will be printed.
>  I<bandwidth> can be used to set bandwidth limit for the active job.
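
(So in practice the async abort becomes something like
"virsh blockjob dom vda --abort --async", and a bare "virsh blockjob dom vda"
now defaults to --info -- with "dom" and "vda" standing in for a real domain
name and disk.)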

ACK with the change to the error message.



