[libvirt] [PATCH 10/11 v2] qemu: drivePivot: Fix assumption when 'block-job-complete' fails

Peter Krempa pkrempa at redhat.com
Wed Apr 1 20:40:51 UTC 2015


QEMU does not abandon the mirror. The job carries on in the synchronised
phase and it might be either pivoted again or cancelled. The commit
hints that the described behavior was happening in a downstream version.

If the command returns false there are two possible options:
1) qemu did not reach the point where it would ask the block job to
pivot
2) pivotting failed in the actual qemu coroutine

If either of those would happen we return failure and reset the
condition that waits for the block job to complete. This makes the API
fail but in case where qemu would actually abandon the mirror the fact
is notified via the event and handled asynchronously.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1202704
---

Notes:
    I've spent some time looking how the active commit and copy  job actually
    works in qemu, but I did not check if that behavior changed in the upstream
    releases. At any rate, it makes sense  thus I expect that it was there ever-since.
    
    Version 2:
    - this version resets the flag that makes libvirt wait on the event. This should
    make the API as rugged as it can possibly be.

 src/qemu/qemu_driver.c | 27 ++++++---------------------
 1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 2dd8ed4..52c3587 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -16128,27 +16128,10 @@ qemuDomainBlockPivot(virQEMUDriverPtr driver,
     }

     if (ret < 0) {
-        /* On failure, qemu abandons the mirror, and reverts back to
-         * the source disk (RHEL 6.3 has a bug where the revert could
-         * cause catastrophic failure in qemu, but we don't need to
-         * worry about it here as it is not an upstream qemu problem.  */
-        /* XXX should we be parsing the exact qemu error, or calling
-         * 'query-block', to see what state we really got left in
-         * before killing the mirroring job?
-         * XXX We want to revoke security labels and disk lease, as
-         * well as audit that revocation, before dropping the original
-         * source.  But it gets tricky if both source and mirror share
-         * common backing files (we want to only revoke the non-shared
-         * portion of the chain); so for now, we leak the access to
-         * the original.  */
-        virStorageSourceFree(disk->mirror);
-        disk->mirror = NULL;
-        disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_NONE;
-        disk->mirrorJob = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN;
-        disk->blockjob = false;
+        /* The pivot failed. The block job in QEMU remains in the synchronised
+         * phase. Reset the state we changed and return the error to the user */
+        disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_READY;
     }
-    if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm) < 0)
-        ret = -1;

  cleanup:
     if (oldsrc)
@@ -16354,8 +16337,10 @@ qemuDomainBlockJobAbort(virDomainPtr dom,

     if (disk->mirror && (flags & VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)) {
         ret = qemuDomainBlockPivot(driver, vm, device, disk);
-        if (ret < 0 && async)
+        if (ret < 0 && async) {
+            disk->blockJobSync = false;
             goto endjob;
+        }
         goto waitjob;
     }
     if (disk->mirror) {
-- 
2.2.2




More information about the libvir-list mailing list