[libvirt] [PATCH 10/11 v2] qemu: drivePivot: Fix assumption when 'block-job-complete' fails

John Ferlan jferlan at redhat.com
Wed Apr 8 17:51:57 UTC 2015



On 04/01/2015 04:40 PM, Peter Krempa wrote:
> QEMU does not abandon the mirror. The job carries on in the synchronised
> phase and it might be either pivoted again or cancelled. The commit
> hints that the described behavior was happening in a downstream version.
> 
> If the command returns false there are two possible options:
> 1) qemu did not reach the point where it would ask the block job to
> pivot
> 2) pivotting failed in the actual qemu coroutine
> 
> If either of those would happen we return failure and reset the
> condition that waits for the block job to complete. This makes the API
> fail but in case where qemu would actually abandon the mirror the fact
> is notified via the event and handled asynchronously.
> 
> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1202704
> ---
> 
> Notes:
>     I've spent some time looking how the active commit and copy  job actually
>     works in qemu, but I did not check if that behavior changed in the upstream
>     releases. At any rate, it makes sense  thus I expect that it was there ever-since.
>     
>     Version 2:
>     - this version resets the flag that makes libvirt wait on the event. This should
>     make the API as rugged as it can possibly be.
> 
>  src/qemu/qemu_driver.c | 27 ++++++---------------------
>  1 file changed, 6 insertions(+), 21 deletions(-)
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 2dd8ed4..52c3587 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -16128,27 +16128,10 @@ qemuDomainBlockPivot(virQEMUDriverPtr driver,
>      }
> 
>      if (ret < 0) {
> -        /* On failure, qemu abandons the mirror, and reverts back to
> -         * the source disk (RHEL 6.3 has a bug where the revert could
> -         * cause catastrophic failure in qemu, but we don't need to
> -         * worry about it here as it is not an upstream qemu problem.  */
> -        /* XXX should we be parsing the exact qemu error, or calling
> -         * 'query-block', to see what state we really got left in
> -         * before killing the mirroring job?
> -         * XXX We want to revoke security labels and disk lease, as
> -         * well as audit that revocation, before dropping the original
> -         * source.  But it gets tricky if both source and mirror share
> -         * common backing files (we want to only revoke the non-shared
> -         * portion of the chain); so for now, we leak the access to
> -         * the original.  */
> -        virStorageSourceFree(disk->mirror);
> -        disk->mirror = NULL;
> -        disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_NONE;
> -        disk->mirrorJob = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN;
> -        disk->blockjob = false;
> +        /* The pivot failed. The block job in QEMU remains in the synchronised
> +         * phase. Reset the state we changed and return the error to the user */
> +        disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_READY;
>      }



I won't pretend to say I understand all the comments that get deleted in
this, but I assume they were reacting to various "changes" as time went
on with perhaps a final decision at some point in time to

So I'd say a weak ACK only because I don't have all the history on this
nor do I have in mind all the various states...  Although since I do see
the code checks a few lines above:

    if (disk->mirrorState != VIR_DOMAIN_DISK_MIRROR_STATE_READY) {
         error...
    }

returning mirrorState back to READY does seem logical to me.

My only other comment would be you could move the comment outside the {}
and then remove the {}'s...

John

FWIW:
Depending on the dictionary one uses - synchronized or canceled may be
used, but using synchronised and cancelled seems to go against the will
of Thunderbird's spell checker ;-)

> -    if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm) < 0)
> -        ret = -1;
> 
>   cleanup:
>      if (oldsrc)
> @@ -16354,8 +16337,10 @@ qemuDomainBlockJobAbort(virDomainPtr dom,
> 
>      if (disk->mirror && (flags & VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)) {
>          ret = qemuDomainBlockPivot(driver, vm, device, disk);
> -        if (ret < 0 && async)
> +        if (ret < 0 && async) {
> +            disk->blockJobSync = false;
>              goto endjob;
> +        }
>          goto waitjob;
>      }
>      if (disk->mirror) {
> 




More information about the libvir-list mailing list