[libvirt] [RFC] New domain job control and stat APIs

Peter Krempa pkrempa at redhat.com
Wed Jul 10 12:27:21 UTC 2019


Currently we don't have a consolidated approach for managing
asynchronous long-running domain jobs. Historically there were
long-running jobs which interlocked with each other and thus only one
such job was possible at a given time (migration, save, restore, dump).

These jobs have a rather inflexible set of APIs:
virDomainGetJobInfo, virDomainGetJobStats, virDomainAbortJob.

These don't really allow selecting which job to terminate since there's
only one. Thus, when we wanted to add different kinds of jobs which do
not necessarily interlock and are able to run in parallel, we had to
introduce another set of APIs.

This resulted in the creation of the block job APIs:
virDomainBlockJobAbort, virDomainGetBlockJobInfo

These allow parallel jobs (discriminated by the disk to which the job
belongs) but are not universal, nor do they allow parallel jobs on a
single disk.

Similarly, block jobs can also become detached from the disk, e.g. if
the guest unplugs the disk frontend. The job would then linger in limbo
and would not be controllable. (This is certainly a possibility with
-blockdev.)

With -blockdev we also get a potentially long-running blockdev-create
job, not bound to any disk, as part of kicking off a snapshot or
block copy job. This one might also get stuck and in the current state
is not really controllable.

Additionally, the upcoming block-backup job will be a combination of the
above. It's a job which spans multiple disks (thus not really a block
job in libvirt terminology) but not a domain job either, as there
can potentially be more than one block backup job. The proposal for
block-backup introduces its own single-purpose set of APIs for managing
the backup job only, but abuses the block job and domain job events to
distribute the async state updates.

With this series I want to introduce a set of APIs for managing the jobs
which are designed to be universal enough, together with a new event, so
that nothing will have to be retrofitted onto the existing infrastructure.

An example of the job XML would be:

<job type='block-commit-active' state='ready'>
  <config>
    <disk>vda</disk>
    <top>vda[1]</top>
    <base>vda[5]</base>
  </config>
  <stats>
   <current>12345</current>
   <end>12345</end>
  </stats>
</job>

but this will mostly be a topic for the second part of this exercise
after we discuss the APIs.

The new infrastructure will also allow adding a flag for all the
existing APIs which kick off a job so that the job will persist even
after it finishes. This will also properly implement the statistics for
a finished migration and similar.

Obviously we will need to take special care when wiring these up so that
the old APIs work for old situations and the events are also reported
correctly.

The initial idea would be to implement the stats XML both for the domain
jobs (migration, dump) and blockjobs to simplify the job for mgmt apps
so that they won't have to infer whether the given job type is already
reported in the new API.

Additionally we can also implement flags for the XML getter API that
will skip the stats gathering as that may require monitor interactions.
Another possibility would be to return an abbreviated XML in the
listing API.
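
To illustrate the ownership rules documented for virDomainJobList in the
patch below (each element is freed with virDomainJobFree, the array
itself with free()), here is a minimal self-contained sketch. Nothing in
it is libvirt code: the sketchJob struct, sketchJobList() and the job
names are hypothetical stand-ins that only mimic the proposed calling
convention, so the real implementation may well look different.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the proposed virDomainJob struct. */
typedef struct {
    char *name;
    int type;
    int state;
} sketchJob;

/* Plays the role documented for virDomainJobFree(): frees one job. */
static void
sketchJobFree(sketchJob *job)
{
    if (!job)
        return;
    free(job->name);
    free(job);
}

/* Frees every element, then the array itself, as the caller is asked to. */
static void
sketchJobListFree(sketchJob **jobs, int njobs)
{
    int i;

    if (!jobs)
        return;
    for (i = 0; i < njobs; i++)
        sketchJobFree(jobs[i]);
    free(jobs);
}

/* Mock of the proposed listing call: returns the job count, fills @jobs
 * when it is non-NULL, and returns -1 with *jobs set to NULL on error. */
static int
sketchJobList(sketchJob ***jobs)
{
    static const char *names[] = { "job-a", "job-b" }; /* made-up names */
    const int njobs = 2;
    sketchJob **list;
    int i;

    if (!jobs)
        return njobs;

    *jobs = NULL;
    if (!(list = calloc(njobs, sizeof(*list))))
        return -1;

    for (i = 0; i < njobs; i++) {
        size_t len = strlen(names[i]) + 1;

        if (!(list[i] = calloc(1, sizeof(*list[i]))) ||
            !(list[i]->name = malloc(len))) {
            /* Unfilled slots are NULL thanks to calloc, so this is safe. */
            sketchJobListFree(list, njobs);
            return -1;
        }
        memcpy(list[i]->name, names[i], len);
    }

    *jobs = list;
    return njobs;
}

/* Caller-side pattern: list, inspect, release. Returns the job count. */
static int
sketchCountAndRelease(void)
{
    sketchJob **jobs = NULL;
    int njobs = sketchJobList(&jobs);

    if (njobs < 0)
        return -1;

    assert(njobs == 0 || jobs != NULL);
    sketchJobListFree(jobs, njobs);
    return njobs;
}
```

Passing NULL for the array only queries the count, which matches the
"or NULL if the list is not required" wording of the proposed docs.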
---
 include/libvirt/libvirt-domain.h | 91 +++++++++++++++++++++++++++++++
 src/libvirt-domain.c             | 94 ++++++++++++++++++++++++++++++++
 2 files changed, 185 insertions(+)

diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
index 2dbd74d4f3..dac77771be 100644
--- a/include/libvirt/libvirt-domain.h
+++ b/include/libvirt/libvirt-domain.h
@@ -4485,6 +4485,28 @@ typedef void (*virConnectDomainEventBlockThresholdCallback)(virConnectPtr conn,
                                                             unsigned long long excess,
                                                             void *opaque);

+/**
+ * virConnectDomainEventJobStateCallback:
+ * @conn: connection object
+ * @dom: domain on which the event occurred
+ * @jobname: name of job which changed state
+ * @jobtype: type of the job
+ * @newstate: the new state the job entered
+ * @opaque: application specified data
+ *
+ * The callback occurs when a long running domain job (see virDomainJobList)
+ * changes state.
+ *
+ * The callback signature to use when registering for an event of type
+ * VIR_DOMAIN_EVENT_ID_JOB_STATE with virConnectDomainEventRegisterAny().
+ */
+typedef void (*virConnectDomainEventJobStateCallback)(virConnectPtr conn,
+                                                      virDomainPtr dom,
+                                                      const char *jobname,
+                                                      virDomainJobType jobtype,
+                                                      virDomainJobState newstate,
+                                                      void *opaque);
+
 /**
  * VIR_DOMAIN_EVENT_CALLBACK:
  *
@@ -4527,6 +4549,7 @@ typedef enum {
     VIR_DOMAIN_EVENT_ID_DEVICE_REMOVAL_FAILED = 22, /* virConnectDomainEventDeviceRemovalFailedCallback */
     VIR_DOMAIN_EVENT_ID_METADATA_CHANGE = 23, /* virConnectDomainEventMetadataChangeCallback */
     VIR_DOMAIN_EVENT_ID_BLOCK_THRESHOLD = 24, /* virConnectDomainEventBlockThresholdCallback */
+    VIR_DOMAIN_EVENT_ID_JOB_STATE = 25, /* virConnectDomainEventJobStateCallback */

 # ifdef VIR_ENUM_SENTINELS
     VIR_DOMAIN_EVENT_ID_LAST
@@ -4896,4 +4919,72 @@ int virDomainGetLaunchSecurityInfo(virDomainPtr domain,
                                    int *nparams,
                                    unsigned int flags);

+typedef enum {
+    VIR_DOMAIN_JOB_TYPE_NONE = 0,
+    VIR_DOMAIN_JOB_TYPE_MIGRATION = 1,
+    VIR_DOMAIN_JOB_TYPE_BLOCK_PULL = 2,
+    [...]
+
+# ifdef VIR_ENUM_SENTINELS
+    VIR_DOMAIN_JOB_TYPE_LAST
+# endif
+} virDomainJobType;
+
+
+typedef enum {
+    VIR_DOMAIN_JOB_STATE_NONE = 0, /* unknown job state */
+    VIR_DOMAIN_JOB_STATE_RUNNING = 1, /* job is currently running */
+    VIR_DOMAIN_JOB_STATE_READY = 2, /* job reached a synchronized state and may be finalized */
+    VIR_DOMAIN_JOB_STATE_FAILED = 3, /* job has failed */
+    VIR_DOMAIN_JOB_STATE_COMPLETED = 4, /* job has completed successfully */
+    VIR_DOMAIN_JOB_STATE_ABORTED = 5, /* job has been aborted */
+    [...]
+
+# ifdef VIR_ENUM_SENTINELS
+    VIR_DOMAIN_JOB_STATE_LAST
+# endif
+} virDomainJobState;
+
+
+typedef struct _virDomainJob virDomainJob;
+typedef virDomainJob *virDomainJobPtr;
+struct _virDomainJob {
+    char *name;
+    virDomainJobType type;
+    virDomainJobState state;
+
+    /* possibly overkill? - currently empty */
+    virTypedParameterPtr data;
+    size_t ndata;
+};
+
+
+void virDomainJobFree(virDomainJobPtr job);
+
+int virDomainJobList(virDomainPtr domain,
+                     virDomainJobPtr **jobs,
+                     unsigned int flags);
+
+char *virDomainJobGetXMLDesc(virDomainPtr domain,
+                             const char *jobname,
+                             unsigned int flags);
+
+typedef enum {
+    VIR_DOMAIN_JOB_CONTROL_OPERATION_NONE = 0,
+    VIR_DOMAIN_JOB_CONTROL_OPERATION_ABORT = 1,
+    VIR_DOMAIN_JOB_CONTROL_OPERATION_FINALIZE = 2,
+    VIR_DOMAIN_JOB_CONTROL_OPERATION_PAUSE = 3,
+    VIR_DOMAIN_JOB_CONTROL_OPERATION_RESUME = 4,
+    VIR_DOMAIN_JOB_CONTROL_OPERATION_DISMISS = 5,
+
+# ifdef VIR_ENUM_SENTINELS
+    VIR_DOMAIN_JOB_CONTROL_OPERATION_LAST
+# endif
+} virDomainJobControlOperation;
+
+int virDomainJobControl(virDomainPtr domain,
+                        const char *jobname,
+                        virDomainJobControlOperation op,
+                        unsigned int flags);
+
 #endif /* LIBVIRT_DOMAIN_H */
diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c
index 3d12e7c125..aa5571818f 100644
--- a/src/libvirt-domain.c
+++ b/src/libvirt-domain.c
@@ -12362,3 +12362,97 @@ int virDomainGetLaunchSecurityInfo(virDomainPtr domain,
     virDispatchError(domain->conn);
     return -1;
 }
+
+
+/**
+ * virDomainJobFree:
+ * @job: pointer to virDomainJob object
+ *
+ * Frees the memory associated with @job.
+ */
+void
+virDomainJobFree(virDomainJobPtr job)
+{
+    [...]
+}
+
+
+/**
+ * virDomainJobList:
+ * @domain: pointer to a domain
+ * @jobs: Pointer to a variable to store the array containing job description
+ *        objects or NULL if the list is not required.
+ * @flags: optional flags (currently unused, callers should always pass 0)
+ *
+ * Collects a list of background jobs associated with @domain and returns it in
+ * an allocated array of virDomainJobPtr structs. The jobs include migration jobs,
+ * block jobs and any other possibly long-running asynchronous operation.
+ *
+ * The caller is responsible for freeing the members of the returned @jobs array
+ * using virDomainJobFree() and the whole array using free().
+ *
+ * Returns the number of jobs running on @domain on success (optionally filling
+ * @jobs if non-NULL) or -1 on error (value of @jobs is set to NULL).
+ */
+int
+virDomainJobList(virDomainPtr domain,
+                 virDomainJobPtr **jobs,
+                 unsigned int flags)
+{
+    [...]
+}
+
+
+/**
+ * virDomainJobGetXMLDesc:
+ * @domain: pointer to a domain
+ * @jobname: name of the domain job to operate on
+ * @flags: optional flags (currently unused, callers should always pass 0)
+ *
+ * Returns a string containing a UTF-8 encoded XML document describing the
+ * configuration, state and progress of domain job @jobname. Please refer to the
+ * job XML documentation for information on the format of the returned document.
+ *
+ * In case of error NULL is returned. Caller is responsible for free()-ing the
+ * returned string.
+ */
+char *
+virDomainJobGetXMLDesc(virDomainPtr domain,
+                       const char *jobname,
+                       unsigned int flags)
+{
+    [...]
+}
+
+
+/**
+ * virDomainJobControl:
+ * @domain: pointer to a domain
+ * @jobname: name of the domain job to operate on
+ * @op: operation to perform on @jobname
+ * @flags: optional flags (currently unused, callers should always pass 0)
+ *
+ * Requests change of state of @jobname. Note that it depends on the type of
+ * @jobname whether @op is supported.
+ *
+ * VIR_DOMAIN_JOB_CONTROL_OPERATION_FINALIZE is supported only with
+ * VIR_DOMAIN_JOB_TYPE_BLOCK_COPY and VIR_DOMAIN_JOB_TYPE_BLOCK_COMMIT_ACTIVE.
+ *
+ * VIR_DOMAIN_JOB_CONTROL_OPERATION_PAUSE and VIR_DOMAIN_JOB_CONTROL_OPERATION_RESUME
+ * are supported only with VIR_DOMAIN_JOB_TYPE_BLOCK_* type jobs.
+ *
+ * The request to change state is asynchronous and callers should install an
+ * event callback for VIR_DOMAIN_EVENT_ID_JOB_STATE if they wish to be notified
+ * when the state change occurred. (Note that the callback may fire before this
+ * API returns).
+ *
+ * Returns 0 on success or -1 on error.
+ */
+int
+virDomainJobControl(virDomainPtr domain,
+                    const char *jobname,
+                    virDomainJobControlOperation op,
+                    unsigned int flags)
+{
+    [...]
+}
-- 
2.21.0
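
The restrictions spelled out in the virDomainJobControl documentation
above (FINALIZE only for block copy and active block commit, PAUSE and
RESUME only for block jobs) could be checked along the following lines.
This is only a sketch under stated assumptions: the sketch* names are
hypothetical, the job type enum is a local placeholder because the patch
elides most of virDomainJobType, and treating ABORT and DISMISS as always
allowed is a guess, since the patch states no restriction for them.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from virDomainJobControlOperation in the patch. */
typedef enum {
    SKETCH_JOB_OP_NONE = 0,
    SKETCH_JOB_OP_ABORT = 1,
    SKETCH_JOB_OP_FINALIZE = 2,
    SKETCH_JOB_OP_PAUSE = 3,
    SKETCH_JOB_OP_RESUME = 4,
    SKETCH_JOB_OP_DISMISS = 5,
} sketchJobOp;

/* Placeholder job types: the names follow the patch text, but the
 * numeric values are arbitrary because the real enum is elided. */
typedef enum {
    SKETCH_JOB_TYPE_MIGRATION,
    SKETCH_JOB_TYPE_BLOCK_PULL,
    SKETCH_JOB_TYPE_BLOCK_COPY,
    SKETCH_JOB_TYPE_BLOCK_COMMIT_ACTIVE,
} sketchJobType;

/* True for the types standing in for VIR_DOMAIN_JOB_TYPE_BLOCK_*. */
static bool
sketchJobTypeIsBlock(sketchJobType type)
{
    switch (type) {
    case SKETCH_JOB_TYPE_BLOCK_PULL:
    case SKETCH_JOB_TYPE_BLOCK_COPY:
    case SKETCH_JOB_TYPE_BLOCK_COMMIT_ACTIVE:
        return true;
    case SKETCH_JOB_TYPE_MIGRATION:
        return false;
    }
    assert(0 && "unknown job type");
    return false;
}

/* Encodes the per-type restrictions listed in the API documentation. */
static bool
sketchJobOpSupported(sketchJobType type, sketchJobOp op)
{
    switch (op) {
    case SKETCH_JOB_OP_FINALIZE:
        return type == SKETCH_JOB_TYPE_BLOCK_COPY ||
               type == SKETCH_JOB_TYPE_BLOCK_COMMIT_ACTIVE;
    case SKETCH_JOB_OP_PAUSE:
    case SKETCH_JOB_OP_RESUME:
        return sketchJobTypeIsBlock(type);
    case SKETCH_JOB_OP_ABORT:
    case SKETCH_JOB_OP_DISMISS:
        return true; /* assumption: no restriction is documented */
    case SKETCH_JOB_OP_NONE:
        return false;
    }
    return false;
}
```

A driver could run such a check before talking to the monitor, so that
an unsupported operation fails early with a clear error.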



