[libvirt] [PATCH 1/5] snapshot: Implement reverting for external disk snapshots

Povilas Kanapickas povilas at radix.lt
Fri Nov 9 18:58:52 UTC 2018


Hi Peter,

Many thanks for the reviews. I replied inline below.

On 07/11/2018 17:40, Peter Krempa wrote:
> On Sun, Oct 21, 2018 at 19:38:48 +0300, Povilas Kanapickas wrote:
> [...]
>> +                                           int* out_mapping)
>> +{
>> +    size_t i, j;
>> +    int found_idx;
>> +    virDomainSnapshotDiskDefPtr snap_disk;
>> +
>> +    for (i = 0; i < snap_def->ndisks; ++i) {
>> +        snap_disk = &(snap_def->disks[i]);
>> +        if (snap_disk->snapshot != VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL)
>> +            continue;
>> +
>> +        found_idx = -1;
>> +        for (j = 0; j < dom_def->ndisks; ++j) {
>> +            if (STREQ(dom_def->disks[j]->dst, snap_disk->name)) {
>> +                found_idx = j;
>> +                break;
>> +            }
>> +        }
>> +        out_mapping[i] = found_idx;
>> +    }
>> +}
>> +
>> +/* This function reverts an external snapshot disk state. With external disks
>> +   we can't just revert to the disks listed in the domain stored within the
>> +   snapshot, because it's read-only and might be used by other VMs in different
>> +   backing chains. Since the contents of the current disks will be discarded
>> +   in favor of data in the snapshot, we reuse them by resetting them and
>> +   pointing the backing image link to the disks listed within the snapshot.
>> +
>> +   The domain is expected to be inactive.
>> +
>> +   new_def is the new domain definition that will be stored to vm sometime in
>> +   the future.
>> + */
>> +static int
>> +qemuDomainSnapshotRevertInactiveExternal(virQEMUDriverPtr driver,
>> +                                         virDomainObjPtr vm,
>> +                                         virDomainSnapshotObjPtr snap,
>> +                                         virDomainDefPtr new_def)
>> +{
>> +    /* We have the following disks recorded in the snapshot and the current
>> +       domain definition:
>> +
>> +        - the current disk state before the revert (vm->def->disks). We'll
>> +          discard and reuse them.
> 
> So this has multiple problems:
> 
> 1) You can have a chain of snapshots and revert in middle of that:
>     - this will destroy any snapshots which depend on the one you are
>       reverting to and that is not acceptable

The intention was to drop just the most current disk, the one that does
not have a snapshot yet. Then we can create another file and point its
backing image to the disk storing the snapshot that we want to revert
to. As far as I understand, this does not affect any later snapshots in
the chain.

> 2) The snapshot can have a different mapping of disks. When reverting
> the old topology should be kept as it existed at the time when the
> snapshot was created.

Agreed. But what would be the best outcome in situations when a snapshot
does not include a certain disk? My thinking is that if user does not
include a disk into a snapshot then he doesn't care about the change of
its contents since the last snapshot. Thus we could iterate the parent
snapshots until we find one that includes that disk and then revert the
state to that disk. If neither the snapshot or its parents include that
disk then we could probably leave the current disk state unchanged.

> 3) you can't return to the state that you are leaving (might be
> desirable but needs to be opt-in)
>
> This basically means that we need a new reversion API which will allow
> to take an XML and will basically fork the snapshot into an alternate
> history and also will mark the state you are leaving so that we can
> return to that state.
> 
> Reverting to that "new" state will result into just swithcing the
> definitions and continuing to use the images and make changes. (Which
> should be done for any leaf snapshot in the snapshot tree).
> 
> Deleting the state as you do is IMO not acceptable.

I think we could simplify that a lot if we accepted that revert is by
definition a lossy operation and any changes since last snapshot would
be lost.

Otherwise storing the previous state would just complicate things:
consider the situation where we are repeatedly reverting back and forth
between two snapshots. Either we will need to save all states between
the reverts and implement ability to store potentially more than one
previous state as a child of any snapshot. Or we will need to accept
that at some point data loss is acceptable and, say, limit the number of
state children of each snapshot to one.

I suggest that if we accepted snapshot operation as lossy, then your use
case could be implemented by a pair of <take snapshot> and <lossy revert
to other snapshot> operations. Bonus would be that such snapshots
pointing to previous states could be manipulated by existing APIs, so
tools won't need to be updated and so on.

The main problem I see currently is that user needs to somehow specify
new paths for all disks in the snapshot he wants to revert to. All paths
listed in the snapshot currently may be part of other snapshot chains.
I've written a bit about this issue and proposed solutions in this
email:
https://www.redhat.com/archives/libvir-list/2018-October/msg01110.html .
Have you looked into it by chance?

Thank you a lot for your time!

Regards,
Povilas

> 
>> +        - the lists of disks that were snapshotted (snap->def->disks). We'll
>> +          use this information to figure out which disks to actually revert.
>> +        - the original disk state stored in the snapshot
>> +          (snap->def->dom->disks). We'll point reverted disks to use these
>> +          disks as backing images.
>> +
>> +        FIXME: what to do with disks that weren't included in all snapshots
>> +        within the hierrachy?
>> +    */
>> +    size_t i;
>> +    int* snap_disk_to_vm_def_disk_idx_map = NULL;
>> +    int* snap_disk_to_new_def_disk_idx_map = NULL;
>> +    int ret = -1;
>> +    virDomainSnapshotDiskDefPtr snap_disk;
>> +    virDomainDiskDefPtr backing_disk;
>> +    virDomainDiskDefPtr revert_disk;
>> +    virDomainDiskDefPtr new_disk;
>> +
>> +    if (VIR_ALLOC_N(snap_disk_to_vm_def_disk_idx_map, snap->def->ndisks) < 0)
>> +        goto cleanup;
>> +    if (VIR_ALLOC_N(snap_disk_to_new_def_disk_idx_map, snap->def->ndisks) < 0)
>> +        goto cleanup;
>> +
>> +    /* Figure out the matching disks from the current VM state. */
>> +    qemuComputeSnapshotDiskToDomainDiskMapping(snap->def, vm->def,
>> +                                               snap_disk_to_vm_def_disk_idx_map);
>> +    qemuComputeSnapshotDiskToDomainDiskMapping(snap->def, new_def,
>> +                                               snap_disk_to_new_def_disk_idx_map);
>> +
>> +    for (i = 0; i < snap->def->ndisks; ++i) {
>> +        snap_disk = &(snap->def->disks[i]);
>> +        if (snap_disk->snapshot != VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL)
>> +            continue;
>> +        if (snap_disk_to_vm_def_disk_idx_map[i] < 0 ||
>> +            snap_disk_to_new_def_disk_idx_map[i] < 0) {
>> +            // FIXME: we could create additional disks, but for now it's not
>> +            // supported.
>> +            virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
>> +                           _("all disks in the snapshots must have "
>> +                             "equivalents in the current VM state."));
>> +            goto cleanup;
>> +        }
>> +    }
>> +
>> +    /* Discard old disk contents and point them to the backing disks */
>> +    for (i = 0; i < snap->def->ndisks; ++i) {
>> +        snap_disk = &(snap->def->disks[i]);
>> +
>> +        // Note that at the moment we don't support mixing internal and
>> +        // external snapshot modes for different disks, but skip non-external
>> +        // disks just in case.
>> +        if (snap_disk->snapshot != VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL)
>> +            continue;
>> +
>> +        backing_disk = snap->def->dom->disks[snap_disk->idx];
>> +        revert_disk = vm->def->disks[snap_disk_to_vm_def_disk_idx_map[i]];
>> +        if (qemuDomainSnapshotRevertExternalSingleDisk(driver, revert_disk,
>> +                                                       backing_disk) < 0) {
>> +            goto cleanup;
>> +        }
>> +
>> +        /* FIXME: figure out which data exactly needs copying.
>> +         */
>> +        new_disk = new_def->disks[snap_disk_to_new_def_disk_idx_map[i]];
>> +        new_disk->src->format = revert_disk->src->format;
>> +        if (VIR_STRDUP(new_disk->src->path, revert_disk->src->path) < 0) {
>> +            goto cleanup;
>> +        }
>> +    }
>> +    ret = 0;
>> +
>> +cleanup:
>> +    VIR_FREE(snap_disk_to_vm_def_disk_idx_map);
>> +    VIR_FREE(snap_disk_to_new_def_disk_idx_map);
>> +    return ret;
>> +}
>>  
>>  /* The domain is expected to be locked and inactive. */
>>  static int
>> -qemuDomainSnapshotRevertInactive(virQEMUDriverPtr driver,
>> -                                 virDomainObjPtr vm,
>> -                                 virDomainSnapshotObjPtr snap)
>> +qemuDomainSnapshotRevertInactiveInternal(virQEMUDriverPtr driver,
>> +                                         virDomainObjPtr vm,
>> +                                         virDomainSnapshotObjPtr snap)
> 
> Please separate refactors from feature implementation.
> 
>>  {
>>      /* Try all disks, but report failure if we skipped any.  */
>>      int ret = qemuDomainSnapshotForEachQcow2(driver, vm, snap, "-a", true);
>>      return ret > 0 ? -1 : ret;
>>  }
>>  
>> -
>>  static int
>>  qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>>                             unsigned int flags)
>> @@ -15981,6 +16156,7 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>>      virDomainDefPtr config = NULL;
>>      virQEMUDriverConfigPtr cfg = NULL;
>>      virCapsPtr caps = NULL;
>> +    bool is_external = false;
>>      bool was_stopped = false;
>>      qemuDomainSaveCookiePtr cookie;
>>      virCPUDefPtr origCPU = NULL;
>> @@ -16043,11 +16219,13 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>>          goto endjob;
>>      }
>>  
>> -    if (virDomainSnapshotIsExternal(snap)) {
>> +    if (snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL) {
>>          virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
>> -                       _("revert to external snapshot not supported yet"));
>> +                       _("revert to external snapshot with memory state is "
>> +                         "not supported yet"));
> 
> Fair enough in RFC state, but kind of useless otherwise.
> 
>>          goto endjob;
>>      }
>> +    is_external = virDomainSnapshotIsExternal(snap);
>>  
>>      if (!(flags & VIR_DOMAIN_SNAPSHOT_REVERT_FORCE)) {
>>          if (!snap->def->dom) {
>> @@ -16090,6 +16268,11 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>>                                    driver->xmlopt, NULL, true);
>>          if (!config)
>>              goto endjob;
>> +    } else if (is_external) {
>> +        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
>> +                       _("revert to a an external snapshot without the VM "
>> +                         "definition recorded is not supported."));
> 
> This can't happen. External snapshot without definition can't exist
> since libvirt already tracked definitions when external snapshots were
> added.
> 
>> +        goto endjob;
>>      }
>>  
>>      cookie = (qemuDomainSaveCookiePtr) snap->def->cookie;
>> @@ -16110,8 +16293,8 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>>              /* Transitions 5, 6, 8, 9 */
>>              /* Check for ABI compatibility. We need to do this check against
>>               * the migratable XML or it will always fail otherwise */
>> +            bool compatible = true;
>>              if (config) {
>> -                bool compatible;
>>  
>>                  /* Replace the CPU in config and put the original one in priv
>>                   * once we're done. When we have the updated CPU def in the
>> @@ -16152,22 +16335,27 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>>                          goto endjob;
>>                      }
>>                      virResetError(err);
>> -                    qemuProcessStop(driver, vm,
>> -                                    VIR_DOMAIN_SHUTOFF_FROM_SNAPSHOT,
>> -                                    QEMU_ASYNC_JOB_START, 0);
>> -                    virDomainAuditStop(vm, "from-snapshot");
>> -                    detail = VIR_DOMAIN_EVENT_STOPPED_FROM_SNAPSHOT;
>> -                    event = virDomainEventLifecycleNewFromObj(vm,
>> -                                                     VIR_DOMAIN_EVENT_STOPPED,
>> -                                                     detail);
>> -                    virObjectEventStateQueue(driver->domainEventState, event);
>> -                    /* Start after stop won't be an async start job, so
>> -                     * reset to none */
>> -                    jobType = QEMU_ASYNC_JOB_NONE;
>> -                    goto load;
>>                  }
>>              }
>>  
>> +            if (!compatible || is_external) {
>> +                // If the snapshot is external we completely stop QEMU, adjust
>> +                // the disk state and restart it.
>> +                qemuProcessStop(driver, vm,
>> +                                VIR_DOMAIN_SHUTOFF_FROM_SNAPSHOT,
>> +                                QEMU_ASYNC_JOB_START, 0);
>> +                virDomainAuditStop(vm, "from-snapshot");
>> +                detail = VIR_DOMAIN_EVENT_STOPPED_FROM_SNAPSHOT; 
>> +                event = virDomainEventLifecycleNewFromObj(vm,
>> +                                                 VIR_DOMAIN_EVENT_STOPPED,
>> +                                                 detail);
>> +                virObjectEventStateQueue(driver->domainEventState, event);
>> +                /* Start after stop won't be an async start job, so
>> +                 * reset to none */
>> +                jobType = QEMU_ASYNC_JOB_NONE;
>> +                goto load;
>> +            }
>> +
>>              priv = vm->privateData;
>>              if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) {
>>                  /* Transitions 5, 6 */
> 
> [...]
> 
> 
>> @@ -16221,7 +16415,8 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>>  
>>              rc = qemuProcessStart(snapshot->domain->conn, driver, vm,
>>                                    cookie ? cookie->cpu : NULL,
>> -                                  jobType, NULL, -1, NULL, snap,
>> +                                  jobType, NULL, -1, NULL,
>> +                                  is_external ? NULL : snap,
> 
> I suppose this misses support for active snapshots with memory, where
> also the memory state needs to be reverted.
> 
>>                                    VIR_NETDEV_VPORT_PROFILE_OP_CREATE,
>>                                    start_flags);
>>              virDomainAuditStart(vm, "from-snapshot", rc >= 0);
>>
>>
>> --
>> libvir-list mailing list
>> libvir-list at redhat.com
>> https://www.redhat.com/mailman/listinfo/libvir-list





More information about the libvir-list mailing list