[libvirt] [PATCHv2 12/16] storage: cache backing chain while qemu domain is live

Laine Stump laine at laine.org
Tue Oct 16 19:02:55 UTC 2012


On 10/13/2012 06:00 PM, Eric Blake wrote:
> Technically, we should not be re-probing any file that qemu might
> be currently writing to.  As such, we should cache the backing
> file chain prior to starting qemu.  This patch adds the cache,
> but does not use it until the next patch.
>
> Ultimately, we want to also store the chain in domain XML, so that
> it is remembered across libvirtd restarts, and so that the only
> kosher way to modify the backing chain of an offline domain will be
> through libvirt API calls, but we aren't there yet.  So for now, we
> merely invalidate the cache any time we do a live operation that
> alters the chain (block-pull, block-commit, external disk snapshot),
> as well as tear down the cache when the domain is not running.
>
> * src/conf/domain_conf.h (_virDomainDiskDef): New field.
> * src/conf/domain_conf.c (virDomainDiskDefFree): Clean new field.
> * src/qemu/qemu_domain.h (qemuDomainDetermineDiskChain): New
> prototype.
> * src/qemu/qemu_domain.c (qemuDomainDetermineDiskChain): New
> function.
> * src/qemu/qemu_driver.c (qemuDomainAttachDeviceDiskLive)
> (qemuDomainChangeDiskMediaLive): Pre-populate chain.
> (qemuDomainSnapshotCreateSingleDiskActive): Uncache chain before
> snapshot.
> * src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Update
> chain after block pull.
> (qemuProcessStop): Uncache chain after process ends.
> ---
>  src/conf/domain_conf.c  |  1 +
>  src/conf/domain_conf.h  |  2 ++
>  src/qemu/qemu_domain.c  | 33 +++++++++++++++++++++++++++++++++
>  src/qemu/qemu_domain.h  |  3 +++
>  src/qemu/qemu_driver.c  | 14 ++++++++++++++
>  src/qemu/qemu_process.c | 20 ++++++++++++++++++++
>  6 files changed, 73 insertions(+)
>
> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
> index a1d5a1a..f4c05a3 100644
> --- a/src/conf/domain_conf.c
> +++ b/src/conf/domain_conf.c
> @@ -971,6 +971,7 @@ void virDomainDiskDefFree(virDomainDiskDefPtr def)
>      VIR_FREE(def->src);
>      VIR_FREE(def->dst);
>      VIR_FREE(def->driverName);
> +    virStorageFileFreeMetadata(def->chain);

Do you think this name is descriptive enough? (Probably for you, but for
someone looking at domain_conf.c for the first time...)

>      VIR_FREE(def->mirror);
>      VIR_FREE(def->auth.username);
>      VIR_FREE(def->wwn);
> diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
> index 1d20522..06c1ccd 100644
> --- a/src/conf/domain_conf.h
> +++ b/src/conf/domain_conf.h
> @@ -47,6 +47,7 @@
>  # include "virobject.h"
>  # include "device_conf.h"
>  # include "bitmap.h"
> +# include "storage_file.h"
>
>  /* forward declarations of all device types, required by
>   * virDomainDeviceDef
> @@ -568,6 +569,7 @@ struct _virDomainDiskDef {
>      } auth;
>      char *driverName;
>      int format; /* enum virStorageFileFormat */
> +    virStorageFileMetadataPtr chain;
>
>      char *mirror;
>      int mirrorFormat; /* enum virStorageFileFormat */
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index b51edc2..9675454 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -2010,3 +2010,36 @@ qemuDomainCleanupRun(struct qemud_driver *driver,
>      priv->ncleanupCallbacks = 0;
>      priv->ncleanupCallbacks_max = 0;
>  }
> +
> +int
> +qemuDomainDetermineDiskChain(struct qemud_driver *driver,
> +                             virDomainDiskDefPtr disk,
> +                             bool force)
> +{
> +    int format;
> +
> +    if (!disk->src)
> +        return 0;
> +
> +    if (disk->chain) {
> +        if (force) {
> +            virStorageFileFreeMetadata(disk->chain);
> +            disk->chain = NULL;
> +        } else {
> +            return 0;
> +        }
> +    }
> +    if (disk->format > 0)
> +        format = disk->format;
> +    else if (driver->allowDiskFormatProbing)
> +        format = VIR_STORAGE_FILE_AUTO;
> +    else
> +        format = VIR_STORAGE_FILE_RAW;
> +
> +    disk->chain = virStorageFileGetMetadata(disk->src, format,
> +                                            driver->user, driver->group,
> +                                            driver->allowDiskFormatProbing);
> +    if (!disk->chain && !force)
> +        return -1;
> +    return 0;


I would naively think that if force was true, you would want to return
failure if you couldn't get the chain, you're returning failure only if
force is false. Since that seems counterintuitive to me, I thought I
should point it out (very likely it's correct and I again just don't
understand, but better safe than sorry :-)

> +}
> diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
> index 26b6c55..8a66f14 100644
> --- a/src/qemu/qemu_domain.h
> +++ b/src/qemu/qemu_domain.h
> @@ -343,6 +343,9 @@ bool qemuDomainJobAllowed(qemuDomainObjPrivatePtr priv,
>  int qemuDomainCheckDiskPresence(struct qemud_driver *driver,
>                                  virDomainObjPtr vm,
>                                  bool start_with_state);
> +int qemuDomainDetermineDiskChain(struct qemud_driver *driver,
> +                                 virDomainDiskDefPtr disk,
> +                                 bool force);
>
>  int qemuDomainCleanupAdd(virDomainObjPtr vm,
>                           qemuDomainCleanupCallback cb);
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 0e7665e..6588b82 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -5705,6 +5705,9 @@ qemuDomainAttachDeviceDiskLive(virConnectPtr conn,
>          goto end;
>      }
>
> +    if (qemuDomainDetermineDiskChain(driver, disk, false) < 0)
> +        goto end;
> +

.. after all - here you have force = false, and expect that it might
fail, so I'm probably wrong...

>      if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) {
>          if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0)) {
>              virReportError(VIR_ERR_INTERNAL_ERROR,
> @@ -5933,6 +5936,9 @@ qemuDomainChangeDiskMediaLive(virDomainObjPtr vm,
>      virCgroupPtr cgroup = NULL;
>      int ret = -1;
>
> +    if (qemuDomainDetermineDiskChain(driver, disk, false) < 0)
> +        goto end;
> +
>      if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) {
>          if (virCgroupForDomain(driver->cgroup,
>                                 vm->def->name, &cgroup, 0) != 0) {
> @@ -10677,6 +10683,14 @@ qemuDomainSnapshotCreateSingleDiskActive(struct qemud_driver *driver,
>      disk->src = source;
>      origdriver = disk->format;
>      disk->format = VIR_STORAGE_FILE_RAW; /* Don't want to probe backing files */
> +    /* XXX Here, we know we are about to alter disk->chain if
> +     * successful, so we nuke the existing chain so that future
> +     * commands will recompute it.  Better would be storing the chain
> +     * ourselves rather than reprobing, but this requires modifying
> +     * domain_conf and our XML to fully track the chain across
> +     * libvirtd restarts.  */

So are you avoiding this for now just to reduce the complexity of this
series? Or is there some other reason that it's not practical?

> +    virStorageFileFreeMetadata(disk->chain);
> +    disk->chain = NULL;
>
>      if (virDomainLockDiskAttach(driver->lockManager, driver->uri,
>                                  vm, disk) < 0)
> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> index f8a2bfd..58cd8da 100644
> --- a/src/qemu/qemu_process.c
> +++ b/src/qemu/qemu_process.c
> @@ -909,6 +909,14 @@ qemuProcessHandleBlockJob(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
>      if (disk) {
>          path = disk->src;
>          event = virDomainEventBlockJobNewFromObj(vm, path, type, status);
> +        /* XXX If we completed a block pull, then recompute the cached
> +         * backing chain to match.  Better would be storing the chain
> +         * ourselves rather than reprobing, but this requires
> +         * modifying domain_conf and our XML to fully track the chain
> +         * across libvirtd restarts.  */
> +        if (type == VIR_DOMAIN_BLOCK_JOB_TYPE_PULL &&
> +            status == VIR_DOMAIN_BLOCK_JOB_COMPLETED)
> +            qemuDomainDetermineDiskChain(driver, disk, true);
>      }
>
>      virDomainObjUnlock(vm);
> @@ -4062,6 +4070,18 @@ void qemuProcessStop(struct qemud_driver *driver,
>          networkReleaseActualDevice(net);
>      }
>
> +    /* XXX For now, disk chains should only be cached while qemu is
> +     * running.  Since we don't track the chain in XML, a user is free
> +     * to update the chain while the domain is offline, and then when
> +     * they next boot the domain we should re-read the chain from the
> +     * files at that point in time.  Only when we track the chain in
> +     * XML can we forbid the user from altering the chain of an
> +     * offline domain.  */

Well, there are two possible levels of caching: 1) cache it in the
domain's status for as long as the domain is running (this would
alleviate the need to recompute it when libvirtd is restarted), 2) cache
it in the domain's persistent config, so that it survives restarts of
the domain. Is it used so much that either level of caching would
actually be a substantial gain?

> +    for (i = 0; i < def->ndisks; i++) {
> +        virStorageFileFreeMetadata(def->disks[i]->chain);
> +        def->disks[i]->chain = NULL;
> +    }
> +

If you're storing the cache in vm->def rather than vm->newDef, shouldn't
this be taken care of automatically? (At domain startup in
qemuProcessStart(), virDomainObjSetDefTransient() is called, which makes
a copy of vm->def into vm->newDef, then at domain shutdown in
qemuProcessStop() (right at the bottom of the function), vm->def is
freed, and vm->newDef is moved into its place.

Since you're only ever populating the chain in vm->def if the domain is
running, and since vm->newDef always exists when the domain is running
(unless it's a transient domain, in which case we don't care), you can
be guaranteed that vm->def will *always* be freed during qemuProcessStop
anyway, making the above loop redundant.


>  retry:
>      if ((ret = qemuRemoveCgroup(driver, vm, 0)) < 0) {
>          if (ret == -EBUSY && (retries++ < 5)) {

In the end, the only real issue I see is the extra unneeded loop right
there at the bottom. Other than that, if there's a better name to use in
the virDomainDiskDef in place of simply "chain" that would be nice, but
not necessary.




More information about the libvir-list mailing list