[libvirt PATCH 3/3] qemu: block: external snapshot-delete implementation for straightforward cases

Peter Krempa pkrempa at redhat.com
Tue Mar 31 10:06:15 UTC 2020


On Tue, Mar 31, 2020 at 11:18:29 +0200, Pavel Mores wrote:
> Works by blockcommitting the snapshot to be deleted into its parent.  Handles
> both deleting children as well and deleting children only (commits the
> children into the snapshot referred to by snapshot-delete argument).  If the
> necessary block commit operation turns out to be active (the snapshot referred
> to by snapshot-delete argument is current, or deleting children is requested)
> pivot is performed as well.
> 
> This implemetation is limited to the straightforward case which assumes no
> branching snapshot chains below the snapshot to be deleted, and the current
> snapshot has to be the leaf of the (linear) chain starting at the snapshot to
> be deleted.  These requirements are checked and enforced at the top of
> qemuDomainSnapshotDeleteExternal().  They might even be too restrictive for
> the case where deletion of children is not requested as in that case,
> requiring that the parent of the snapshot to be deleted not be a branching
> point in snapshot tree should be enough.
> 
> The above limitations do not appear to be too severe under the current state
> of matters where a major source of branching snapshot structures,
> snapshot-revert, is not even implemented for external snapshots yet.  The only
> other known cause of branching in external snapshots thus seems to be if a
> user creates branching by manually creating snapshot images and metadata and
> applies them using snapshot-create --redefine.  At any rate, this work should
> be understood as just a first step to a full support of deleting external
> snapshots.
> 
> Signed-off-by: Pavel Mores <pmores at redhat.com>
> ---
>  src/qemu/qemu_driver.c | 149 +++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 145 insertions(+), 4 deletions(-)

Few quick comments:

> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index b981745ecf..4d4f3f069f 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -16707,6 +16707,149 @@ qemuDomainMomentReparentChildren(void *payload,
>  }
>  
>  
> +/*
> + * We can't use virDomainMomentFindLeaf() as it takes a
> + * virDomainMomentObjListPtr which we don't have.  It might be a good idea to
> + * move this function to a library of virDomainMomentObj helpers, then
> + * reimplement virDomainMomentFindLeaf() in terms of this function as it only
> + * uses its virDomainMomentObjListPtr parameter to fish a virDomainMomentObjPtr
> + * out of it.  Then it just runs this function's algorithm on it.

Please do so ...

> + */
> +static virDomainMomentObjPtr
> +myDomainMomentFindLeaf(virDomainMomentObjPtr moment)

... since this does not conform to the naming scheme.

> +{
> +    if (moment->nchildren != 1)
> +        return NULL;
> +    while (moment->nchildren == 1)
> +        moment = moment->first_child;
> +    if (moment->nchildren == 0)
> +        return moment;
> +    return NULL;
> +}
> +
> +
> +static int
> +qemuDomainSnapshotDeleteExternal(virDomainObjPtr vm,
> +                                 virQEMUDriverPtr driver,
> +                                 virDomainMomentObjPtr snap,
> +                                 unsigned int flags)
> +{
> +    int ret = -1;
> +    bool isActive;
> +    virDomainSnapshotDefPtr snapdef = virDomainSnapshotObjGetDef(snap);
> +    virDomainMomentObjPtr leaf = snap->nchildren ? myDomainMomentFindLeaf(snap) : snap;

Please refrain from using ternary operators here.

> +    virDomainMomentObjPtr parent = snap->parent;
> +    virDomainSnapshotDefPtr parentdef = virDomainSnapshotObjGetDef(parent);
> +    size_t i;
> +
> +    /* This function only works if the chain below 'snap' is linear.  If
> +     * there's no unique leaf it means the chain of 'snap's children
> +     * branches at some point.  Also, if there *is* a leaf but it's not
> +     * the current snapshot, bail out as well. */
> +    if (leaf == NULL) {
> +        virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
> +                       "%s", _("can't delete, snapshot chain branches"));
> +        goto cleanup;
> +    }
> +    if (leaf != virDomainSnapshotGetCurrent(vm->snapshots)) {

Btw, vm->snapshots.base is the 'virDomainMomentObjListPtr' you were
looking for above.

> +        virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
> +                       "%s", _("can't delete, leaf snapshot is not current"));
> +        goto cleanup;
> +    }

Check that VM is active should be added here, so that we can bail out
sooner than when we try to actually commit.

> +
> +    if (flags & (VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN |
> +                 VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN_ONLY)) {
> +        isActive = true;
> +    } else {
> +        isActive = snap == virDomainSnapshotGetCurrent(vm->snapshots);
> +    }
> +
> +    for (i = 0; i < snapdef->ndisks; i++) {
> +        virDomainSnapshotDiskDefPtr snapdisk = &(snapdef->disks[i]);
> +        const char *diskName = snapdisk->name;
> +        const char *basePath = NULL;
> +        const char *topPath = snapdisk->src->path;
> +        int blockCommitFlags = VIR_DOMAIN_BLOCK_COMMIT_DELETE;
> +        virDomainDiskDefPtr disk;
> +
> +        if (!(disk = qemuDomainDiskByName(vm->def, diskName)))
> +            goto cleanup;

This loop is too complex as it does the merging itself. If any of these
fails you have a half-deleted snapshshot which will be very hard to clean up
later. All preparation and validation steps must be extracted and done
prior to actually doing the commiting.

Specifically a case when this would break is e.g. if a disk is unplugged
after a snapshot. This code will attempt looking it up but that will
fail.

> +
> +        if (flags & VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN_ONLY) {
> +            basePath = snapdisk->src->path;
> +            topPath = disk->src->path;
> +        } else {
> +            if (parent->nchildren == 1) {
> +                if (parentdef == NULL) {
> +                    virStorageSourcePtr baseSrc = virStorageFileChainLookup(disk->src, NULL, NULL, 0, NULL);
> +                    if (!baseSrc)
> +                        goto cleanup;
> +                    basePath = baseSrc->path;
> +                } else {
> +                    virDomainSnapshotDiskDefPtr parentdisk = &(parentdef->disks[i]);
> +                    basePath = parentdisk->src->path;
> +                }
> +            } else {
> +                /* TODO 'snap's parent has multiple children, meaning it's a
> +                 * branching point in snapshot tree.  This means we can't
> +                 * delete 'snap' by commiting into its parent as doing so would
> +                 * corrupt the other branches rooted in the parent.  We might
> +                 * still be able to delete 'snap' though by pulling into its
> +                 * child/children. */
> +
> +                virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
> +                               _("can't delete, parent has multiple children"));

Missing jump/handling of error.

> +            }
> +
> +            if (flags & VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN)
> +                topPath = disk->src->path;
> +            else
> +                topPath = snapdisk->src->path;

You must not use paths for doing this lookup. Paths at best work for
local files only and this would make the code not future proof.

Also you want to verify that:
- images you want to merge are actually in the backing chain
- the backing chain looks as snapshots describe it (e.g you unplug vda
  and plug a different chain back)

Doing the validation above will necessarily give you a
virStorageSourcePtr for the appropriate member of the backing chain and
that one should be used as argument for the commit operation.

> +        }
> +
> +        if (isActive)
> +            blockCommitFlags |= VIR_DOMAIN_BLOCK_COMMIT_ACTIVE;
> +
> +        if (qemuDomainBlockCommitImpl(vm, driver, diskName, basePath, topPath,
> +                                      0, blockCommitFlags) < 0)
> +            goto cleanup;

This is a long running operation and thus should be handled via an
async job. Otherwise for the full duration of multiple sequential block
commit operations the VM object will be locked which will prevent
callers from e.g. looking up statistics of the VM or job or potentially
aborting the job.

It is okay in this instance for the API to block for the duration but
I'd strongly suggest you add a flag which will cause the API to
terminate after kicking off the block commit operations and then use the
async job infrastructure to handle the completion notification. Since
adding the parameter is not strictly required for the conversion of the
deletion API to async job as we can keep the API-blocking semantics this
can be done separately.

Also consider kicking off the commits in parallel.

> +
> +        if (isActive) {
> +            /* wait for the blockcommit job to finish (in particular, reach the
> +             * QEMU_BLOCKJOB_STATE_READY state)... */
> +            qemuBlockJobDataPtr job;
> +
> +            if (!(job = qemuBlockJobDiskGetJob(disk))) {
> +                virReportError(VIR_ERR_INVALID_ARG,
> +                            _("disk %s does not have an active block job"), disk->dst);
> +                goto cleanup;
> +            }
> +
> +            qemuBlockJobSyncBegin(job);
> +            qemuBlockJobUpdate(vm, job, QEMU_ASYNC_JOB_NONE);
> +            while (job->state != QEMU_BLOCKJOB_STATE_READY) {

This will loop forever if the job e.g. fails or is aborted before
reaching the 'READY' state.

> +                if (virDomainObjWait(vm) < 0) {
> +                    ret = -1;
> +                    goto cleanup;
> +                }
> +                qemuBlockJobUpdate(vm, job, QEMU_ASYNC_JOB_NONE);
> +            }
> +            qemuBlockJobSyncEnd(vm, job, QEMU_ASYNC_JOB_NONE);

Reference to 'job' obtained by qemuBlockJobDiskGetJob is leaked here.

> +
> +            /* ... and pivot */
> +            if (qemuDomainBlockJobAbortImpl(driver, vm, diskName,
> +                                            VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT) < 0)
> +                goto cleanup;
> +        }

For nonActive case, the code is not waiting for the completion of the
commit job, thus potentially returns success even if the commit itself
fails.

> +    }
> +
> +    ret = 0;
> +
> + cleanup:

No point to have a cleanup label if there's no code.

> +    return ret;
> +}




More information about the libvir-list mailing list