[libvirt] [PATCH v2] qemu: report block job errors from qemu to the user

Nikolay Shirokovskiy nshirokovskiy at virtuozzo.com
Wed Nov 2 09:14:30 UTC 2016



On 31.10.2016 10:10, Nikolay Shirokovskiy wrote:
> 
> 
> On 28.10.2016 20:57, John Ferlan wrote:
>>
>> No commit message... This definitely needs one.
>>
>> Also when there's one patch, you don't need a cover letter - instead you
>> put whatever "message" you want to give to reviewers, the pointer to the
>> previous version, changes with the current code, etc. after the "---"
>> -----------------------------------------------+
>>                                                       |
>> On 10/14/2016 05:12 AM, Nikolay Shirokovskiy wrote:   |
>>> ---                       <-------------------------+
>>>  src/qemu/qemu_blockjob.c     | 13 +++++++++--
>>>  src/qemu/qemu_blockjob.h     |  3 ++-
>>>  src/qemu/qemu_domain.c       |  1 +
>>>  src/qemu/qemu_domain.h       |  1 +
>>>  src/qemu/qemu_driver.c       |  4 ++--
>>>  src/qemu/qemu_migration.c    | 54 +++++++++++++++++++++++++++++++-------------
>>>  src/qemu/qemu_monitor.c      |  5 ++--
>>>  src/qemu/qemu_monitor.h      |  4 +++-
>>>  src/qemu/qemu_monitor_json.c |  4 +++-
>>>  src/qemu/qemu_process.c      |  4 ++++
>>>  10 files changed, 68 insertions(+), 25 deletions(-)
>>>
>>
>> I think this could have been split into setting the blockJobError in
>> qemuMonitorJSONHandleBlockJobImpl along with the VIR_FREE in
>> qemuDomainDiskPrivateDispose and of course the set/fetch in
>> qemuMonitorJSONHandleBlockJobImpl.
>>
>> Then the second patch wold handle grabbing that error message...
>>
>>> diff --git a/src/qemu/qemu_blockjob.c b/src/qemu/qemu_blockjob.c
>>> index 83a5a3f..30beeeb 100644
>>> --- a/src/qemu/qemu_blockjob.c
>>> +++ b/src/qemu/qemu_blockjob.c
>>> @@ -33,6 +33,7 @@
>>>  #include "virstoragefile.h"
>>>  #include "virthread.h"
>>>  #include "virtime.h"
>>> +#include "viralloc.h"
>>>  
>>>  #define VIR_FROM_THIS VIR_FROM_QEMU
>>>  
>>> @@ -53,16 +54,24 @@ VIR_LOG_INIT("qemu.qemu_blockjob");
>>
>> ...
>>
>> Need to update function comments to add @error
>>
>>>  int
>>>  qemuBlockJobUpdate(virQEMUDriverPtr driver,
>>>                     virDomainObjPtr vm,
>>> -                   virDomainDiskDefPtr disk)
>>> +                   virDomainDiskDefPtr disk,
>>> +                   char **error)
>>>  {
>>>      qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk);
>>>      int status = diskPriv->blockJobStatus;
>>>  
>>> +    if (error)
>>> +        *error = NULL;
>>> +
>>>      if (status != -1) {
>>>          qemuBlockJobEventProcess(driver, vm, disk,
>>>                                   diskPriv->blockJobType,
>>>                                   diskPriv->blockJobStatus);
>>>          diskPriv->blockJobStatus = -1;
>>> +        if (error)
>>> +            VIR_STEAL_PTR(*error, diskPriv->blockJobError);
>>> +        else
>>> +            VIR_FREE(diskPriv->blockJobError);
>>
>> Is the VIR_FREE is necessary since on the "next" usage we'd free or we'd
>> free on private dispose.
> 
> Yes. But is not it considered error prone to keep outdated data?
> 
>>
>> Why only support sync mode? (a reference to Jira's suggestion about
>> adjustment to qemuBlockJobEventProcess).
> 
> Well, blockJobError is set only for sync mode in qemuProcessHandleBlockJob.
> Of course it is possible to add error message to 'struct qemuProcessEvent'
> but what for? We can not pass error message further to user anyway in API event
> without enchancing API.
> 
>>>      }
>>>  
>>>      return status;
>>> @@ -241,6 +250,6 @@ qemuBlockJobSyncEnd(virQEMUDriverPtr driver,
>>>                      virDomainDiskDefPtr disk)
>>>  {
>>>      VIR_DEBUG("disk=%s", disk->dst);
>>> -    qemuBlockJobUpdate(driver, vm, disk);
>>> +    qemuBlockJobUpdate(driver, vm, disk, NULL);
>>>      QEMU_DOMAIN_DISK_PRIVATE(disk)->blockJobSync = false;
>>>  }
>>> diff --git a/src/qemu/qemu_blockjob.h b/src/qemu/qemu_blockjob.h
>>> index 775ce95..c452edc 100644
>>> --- a/src/qemu/qemu_blockjob.h
>>> +++ b/src/qemu/qemu_blockjob.h
>>> @@ -27,7 +27,8 @@
>>>  
>>>  int qemuBlockJobUpdate(virQEMUDriverPtr driver,
>>>                         virDomainObjPtr vm,
>>> -                       virDomainDiskDefPtr disk);
>>> +                       virDomainDiskDefPtr disk,
>>> +                       char **error);
>>>  void qemuBlockJobEventProcess(virQEMUDriverPtr driver,
>>>                                virDomainObjPtr vm,
>>>                                virDomainDiskDefPtr disk,
>>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
>>> index 0c9416f..307d83a 100644
>>> --- a/src/qemu/qemu_domain.c
>>> +++ b/src/qemu/qemu_domain.c
>>> @@ -774,6 +774,7 @@ qemuDomainDiskPrivateDispose(void *obj)
>>>  
>>>      qemuDomainSecretInfoFree(&priv->secinfo);
>>>      qemuDomainSecretInfoFree(&priv->encinfo);
>>> +    VIR_FREE(priv->blockJobError);
>>>  }
>>>  
>>>  
>>> diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
>>> index dee5d93..c470cbf 100644
>>> --- a/src/qemu/qemu_domain.h
>>> +++ b/src/qemu/qemu_domain.h
>>> @@ -290,6 +290,7 @@ struct _qemuDomainDiskPrivate {
>>>      /* for some synchronous block jobs, we need to notify the owner */
>>>      int blockJobType;   /* type of the block job from the event */
>>>      int blockJobStatus; /* status of the finished block job */
>>> +    char *blockJobError; /* block job completed event error */
>>>      bool blockJobSync; /* the block job needs synchronized termination */
>>>  
>>>      bool migrating; /* the disk is being migrated */
>>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>>> index 8789c9d..bc7f840 100644
>>> --- a/src/qemu/qemu_driver.c
>>> +++ b/src/qemu/qemu_driver.c
>>> @@ -16326,13 +16326,13 @@ qemuDomainBlockJobAbort(virDomainPtr dom,
>>>                                       VIR_DOMAIN_BLOCK_JOB_CANCELED);
>>>          } else {
>>>              qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk);
>>> -            qemuBlockJobUpdate(driver, vm, disk);
>>> +            qemuBlockJobUpdate(driver, vm, disk, NULL);
>>>              while (diskPriv->blockjob) {
>>>                  if (virDomainObjWait(vm) < 0) {
>>>                      ret = -1;
>>>                      goto endjob;
>>>                  }
>>> -                qemuBlockJobUpdate(driver, vm, disk);
>>> +                qemuBlockJobUpdate(driver, vm, disk, NULL);
>>>              }
>>>          }
>>>      }
>>> diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
>>> index 1c4a80c..1762648 100644
>>> --- a/src/qemu/qemu_migration.c
>>> +++ b/src/qemu/qemu_migration.c
>>> @@ -1859,17 +1859,25 @@ qemuMigrationDriveMirrorReady(virQEMUDriverPtr driver,
>>>      for (i = 0; i < vm->def->ndisks; i++) {
>>>          virDomainDiskDefPtr disk = vm->def->disks[i];
>>>          qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk);
>>> +        char *error;
>>>  
>>
>> Although you know your function does the *error = NULL; - it's always
>> good practice to just initialize here anyway - better safe than sorry...
>>
>> I'll only mention it once, but it is repeated...
>>
>>>          if (!diskPriv->migrating)
>>>              continue;
>>
>> Perhaps somewhat ironic that this code fetches diskPriv and the API
>> we're about to call fills in error with diskPriv->blockJobError or am I
>> missing something more subtle.
> 
> I think of it as of incapsulation.
> 
>>
>>>  
>>> -        status = qemuBlockJobUpdate(driver, vm, disk);
>>> +        status = qemuBlockJobUpdate(driver, vm, disk, &error);
>>>          if (status == VIR_DOMAIN_BLOCK_JOB_FAILED) {
>>> -            virReportError(VIR_ERR_OPERATION_FAILED,
>>> -                           _("migration of disk %s failed"),
>>> -                           disk->dst);
>>> +            if (error) {
>>> +                virReportError(VIR_ERR_OPERATION_FAILED,
>>> +                               _("migration of disk %s failed: %s"),
>>> +                               disk->dst, error);
>>> +                VIR_FREE(error);
>>> +            } else {
>>> +                virReportError(VIR_ERR_OPERATION_FAILED,
>>> +                               _("migration of disk %s failed"), disk->dst);
>>> +            }
>>
>> I was going to say use NULLSTR(), but I see Jirka already requested
>> separate messages in the review of v1.  Another option would be using
>> the first error message format and then in order to determine what to
>> use for the second %s - error ? error : _("unknown") [although that may
>> send message purists off the deepend].
>>
>>
>>>              return -1;
>>>          }
>>> +        VIR_FREE(error);
>>>  
>>>          if (disk->mirrorState != VIR_DOMAIN_DISK_MIRROR_STATE_READY)
>>>              notReady++;
>>> @@ -1908,17 +1916,23 @@ qemuMigrationDriveMirrorCancelled(virQEMUDriverPtr driver,
>>>      for (i = 0; i < vm->def->ndisks; i++) {
>>>          virDomainDiskDefPtr disk = vm->def->disks[i];
>>>          qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk);
>>> +        char *error;
>>>  
>>>          if (!diskPriv->migrating)
>>>              continue;
>>>  
>>> -        status = qemuBlockJobUpdate(driver, vm, disk);
>>> +        status = qemuBlockJobUpdate(driver, vm, disk, &error);
>>>          switch (status) {
>>>          case VIR_DOMAIN_BLOCK_JOB_FAILED:
>>>              if (check) {
>>> -                virReportError(VIR_ERR_OPERATION_FAILED,
>>> -                               _("migration of disk %s failed"),
>>> -                               disk->dst);
>>> +                if (error) {
>>> +                    virReportError(VIR_ERR_OPERATION_FAILED,
>>> +                                   _("migration of disk %s failed: %s"),
>>> +                                   disk->dst, error);
>>> +                } else {
>>> +                    virReportError(VIR_ERR_OPERATION_FAILED,
>>> +                                   _("migration of disk %s failed"), disk->dst);
>>> +                }
>>>                  failed = true;
>>>              }
>>>              /* fallthrough */
>>> @@ -1931,6 +1945,7 @@ qemuMigrationDriveMirrorCancelled(virQEMUDriverPtr driver,
>>>          default:
>>>              active++;
>>>          }
>>> +        VIR_FREE(error);
>>>      }
>>>  
>>>      if (failed) {
>>> @@ -1968,24 +1983,30 @@ qemuMigrationCancelOneDriveMirror(virQEMUDriverPtr driver,
>>>  {
>>>      qemuDomainObjPrivatePtr priv = vm->privateData;
>>>      char *diskAlias = NULL;
>>> +    char *error;
>>>      int ret = -1;
>>>      int status;
>>>      int rv;
>>>  
>>> -    status = qemuBlockJobUpdate(driver, vm, disk);
>>> +    status = qemuBlockJobUpdate(driver, vm, disk, &error);
>>>      switch (status) {
>>>      case VIR_DOMAIN_BLOCK_JOB_FAILED:
>>>      case VIR_DOMAIN_BLOCK_JOB_CANCELED:
>>>          if (failNoJob) {
>>> -            virReportError(VIR_ERR_OPERATION_FAILED,
>>> -                           _("migration of disk %s failed"),
>>> -                           disk->dst);
>>> -            return -1;
>>> +            if (error) {
>>> +                virReportError(VIR_ERR_OPERATION_FAILED,
>>> +                               _("migration of disk %s failed: %s"),
>>> +                               disk->dst, error);
>>> +            } else {
>>> +                virReportError(VIR_ERR_OPERATION_FAILED,
>>> +                               _("migration of disk %s failed"), disk->dst);
>>> +            }
>>> +            goto cleanup;
>>>          }
>>> -        return 1;
>>> -
>>> +    /* fallthrough */
>>>      case VIR_DOMAIN_BLOCK_JOB_COMPLETED:
>>> -        return 1;
>>> +        ret = 1;
>>> +        goto cleanup;
>>>      }
>>>  
>>>      if (!(diskAlias = qemuAliasFromDisk(disk)))
>>> @@ -2003,6 +2024,7 @@ qemuMigrationCancelOneDriveMirror(virQEMUDriverPtr driver,
>>>  
>>>   cleanup:
>>>      VIR_FREE(diskAlias);
>>> +    VIR_FREE(error);
>>>      return ret;
>>>  }
>>>  
>>> diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
>>> index 5175f4e..34a393a 100644
>>> --- a/src/qemu/qemu_monitor.c
>>> +++ b/src/qemu/qemu_monitor.c
>>> @@ -1458,13 +1458,14 @@ int
>>>  qemuMonitorEmitBlockJob(qemuMonitorPtr mon,
>>>                          const char *diskAlias,
>>>                          int type,
>>> -                        int status)
>>> +                        int status,
>>> +                        const char *error)
>>>  {
>>>      int ret = -1;
>>>      VIR_DEBUG("mon=%p", mon);
>>>  
>>>      QEMU_MONITOR_CALLBACK(mon, ret, domainBlockJob, mon->vm,
>>> -                          diskAlias, type, status);
>>> +                          diskAlias, type, status, error);
>>>      return ret;
>>>  }
>>>  
>>> diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h
>>> index 7d78e5b..522675f 100644
>>> --- a/src/qemu/qemu_monitor.h
>>> +++ b/src/qemu/qemu_monitor.h
>>> @@ -146,6 +146,7 @@ typedef int (*qemuMonitorDomainBlockJobCallback)(qemuMonitorPtr mon,
>>>                                                   const char *diskAlias,
>>>                                                   int type,
>>>                                                   int status,
>>> +                                                 const char *error,
>>>                                                   void *opaque);
>>>  typedef int (*qemuMonitorDomainTrayChangeCallback)(qemuMonitorPtr mon,
>>>                                                     virDomainObjPtr vm,
>>> @@ -332,7 +333,8 @@ int qemuMonitorEmitPMSuspend(qemuMonitorPtr mon);
>>>  int qemuMonitorEmitBlockJob(qemuMonitorPtr mon,
>>>                              const char *diskAlias,
>>>                              int type,
>>> -                            int status);
>>> +                            int status,
>>> +                            const char *error);
>>>  int qemuMonitorEmitBalloonChange(qemuMonitorPtr mon,
>>>                                   unsigned long long actual);
>>>  int qemuMonitorEmitPMSuspendDisk(qemuMonitorPtr mon);
>>> diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c
>>> index b93220b..271598f 100644
>>> --- a/src/qemu/qemu_monitor_json.c
>>> +++ b/src/qemu/qemu_monitor_json.c
>>> @@ -802,6 +802,7 @@ qemuMonitorJSONHandleBlockJobImpl(qemuMonitorPtr mon,
>>>  {
>>>      const char *device;
>>>      const char *type_str;
>>> +    const char *error = NULL;
>>>      int type = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN;
>>>      unsigned long long offset, len;
>>>  
>>> @@ -834,6 +835,7 @@ qemuMonitorJSONHandleBlockJobImpl(qemuMonitorPtr mon,
>>>  
>>>      switch ((virConnectDomainEventBlockJobStatus) event) {
>>>      case VIR_DOMAIN_BLOCK_JOB_COMPLETED:
>>> +        error = virJSONValueObjectGetString(data, "error");
>>>          /* Make sure the whole device has been processed */
>>>          if (offset != len)
>>>              event = VIR_DOMAIN_BLOCK_JOB_FAILED;
>>> @@ -848,7 +850,7 @@ qemuMonitorJSONHandleBlockJobImpl(qemuMonitorPtr mon,
>>>      }
>>>  
>>>   out:
>>> -    qemuMonitorEmitBlockJob(mon, device, type, event);
>>> +    qemuMonitorEmitBlockJob(mon, device, type, event, error);
>>>  }
>>>  
>>>  static void
>>> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
>>> index 0f5a11b..e5390b4 100644
>>> --- a/src/qemu/qemu_process.c
>>> +++ b/src/qemu/qemu_process.c
>>> @@ -975,6 +975,7 @@ qemuProcessHandleBlockJob(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
>>>                            const char *diskAlias,
>>>                            int type,
>>>                            int status,
>>> +                          const char *error,
>>>                            void *opaque)
>>>  {
>>>      virQEMUDriverPtr driver = opaque;
>>> @@ -996,6 +997,9 @@ qemuProcessHandleBlockJob(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
>>>          /* We have a SYNC API waiting for this event, dispatch it back */
>>>          diskPriv->blockJobType = type;
>>>          diskPriv->blockJobStatus = status;
>>> +        VIR_FREE(diskPriv->blockJobError);
>>> +        if (error && VIR_STRDUP_QUIET(diskPriv->blockJobError, error) < 0)
>>> +            VIR_WARN("Can not pass error message further: %s", error);
>>>          virDomainObjBroadcast(vm);
>>>      } else {
>>>          /* there is no waiting SYNC API, dispatch the update to a thread */
>>>
>>
>> And why not augment qemuProcessEvent too?  If you did, then the
>> VIR_FREE(diskPriv->blockJobError) would go outside the "if" I would
>> think and the eventual call to qemuBlockJobEventProcess in
>> processBlockJobEvent would be able to manage the error.
> 
> Ok. I see. Looks like qemuBlockJobEventProcess is good place to manage error
> not the qemuBlockJobUpdate.

I going to send a new series where this place will stay as it is. Putting error string
managing in qemuBlockJobEventProcess is not convinient as I see it after evaluation.

qemuBlockJobEventProcess have 2 users basically - async and sync blockjob event
processing. Async mode does not pass error in qemuProcessEvent
now but if it is it would use some structure for opaque pointer in qemuProcessEvent. Sync
mode uses fields of private disk structure to keep details of blockjob event. So as
they keep event details differently then the callers should manage the details memory,
not the qemuProcessEvent.

> 
>>
>> It's not the data isn't there is it?
>>
>> John
>>
> 
> --
> libvir-list mailing list
> libvir-list at redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list
> 




More information about the libvir-list mailing list