[libvirt] [PATCH 2/8] backup: Document nuances between different state capture APIs

Eric Blake eblake at redhat.com
Wed Jun 13 16:42:23 UTC 2018


Upcoming patches will add support for incremental backups via
a new API; but first, we need a landing page that gives an
overview of capturing various pieces of guest state, and which
APIs are best suited to which tasks.

Signed-off-by: Eric Blake <eblake at redhat.com>
---
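As a rough illustration of the "Full vs. partial" entry in the new
page below, here is a toy model of checkpoint-style tracking. It is
only a sketch under stated assumptions: TrackedDisk, full_backup,
incremental_backup and restore are hypothetical names that are not
part of libvirt, the "disk" is a Python list of blocks, and each
backup is assumed to be atomic (no guest writes while it runs).

```python
class TrackedDisk:
    """Toy disk that records which blocks changed since the last checkpoint."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.dirty = set()  # indexes written since the last checkpoint

    def write(self, index, data):
        """Guest write: update the block and remember that it changed."""
        self.blocks[index] = data
        self.dirty.add(index)

    def checkpoint(self):
        """Start a new tracking period; return the blocks dirtied in the old one."""
        changed, self.dirty = self.dirty, set()
        return changed


def full_backup(disk):
    """Copy every block and start tracking changes from this point."""
    disk.checkpoint()
    return dict(enumerate(disk.blocks))


def incremental_backup(disk):
    """Copy only the blocks written since the previous checkpoint."""
    changed = disk.checkpoint()
    return {i: disk.blocks[i] for i in sorted(changed)}


def restore(full, incrementals):
    """Rebuild the image by layering incrementals, oldest first, over the full copy."""
    image = dict(full)
    for inc in incrementals:
        image.update(inc)
    return [image[i] for i in sorted(image)]


disk = TrackedDisk(["a", "b", "c", "d"])
full = full_backup(disk)
disk.write(1, "B")
inc1 = incremental_backup(disk)   # holds only block 1
disk.write(3, "D")
disk.write(1, "B2")
inc2 = incremental_backup(disk)   # holds blocks 1 and 3
restored = restore(full, [inc1, inc2])
```

Each incremental holds only the blocks written since the previous
checkpoint, which is the saving the page describes; the cost is that
a restore needs the full backup plus every later incremental.
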
 docs/docs.html.in               |   5 ++
 docs/domainstatecapture.html.in | 190 ++++++++++++++++++++++++++++++++++++++++
 docs/formatsnapshot.html.in     |   2 +
 3 files changed, 197 insertions(+)
 create mode 100644 docs/domainstatecapture.html.in
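
The "Timing" entry in the new page distinguishes captures that are
consistent to the start of the operation from those that are
consistent only at the end. The toy model below sketches that
difference. It is an illustration only, not how libvirt or any
hypervisor implements it: both function names are hypothetical, the
disk is a Python list, and at most one guest write is modeled per
copy step.

```python
def backup_start_consistent(disk, guest_writes):
    """Copy block by block, preserving the image as it was at the start.

    guest_writes maps a copy step to an (index, data) write that the
    guest issues just before that step.  If the target block has not
    been copied yet, its old content is saved first (copy-before-write).
    """
    backup = {}
    for step in range(len(disk)):
        if step in guest_writes:
            index, data = guest_writes[step]
            if index not in backup:
                backup[index] = disk[index]  # save old data before overwrite
            disk[index] = data
        if step not in backup:
            backup[step] = disk[step]
    return [backup[i] for i in range(len(disk))]


def copy_end_consistent(disk, guest_writes):
    """Copy block by block, mirroring guest writes into the copy.

    The copy matches the disk only once the loop has finished.
    """
    copy = {}
    for step in range(len(disk)):
        if step in guest_writes:
            index, data = guest_writes[step]
            disk[index] = data
            if index in copy:
                copy[index] = data  # keep already-copied blocks in sync
        copy[step] = disk[step]
    return [copy[i] for i in range(len(disk))]


original = ["a", "b", "c", "d"]
writes = {1: (3, "D"), 2: (0, "A")}  # step -> (block index, new data)

disk_one = list(original)
start_image = backup_start_consistent(disk_one, writes)

disk_two = list(original)
end_image = copy_end_consistent(disk_two, writes)
```

With the same guest writes, the first result equals the disk as it
was before the copy began, while the second equals the disk as it is
when the copy finishes.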

diff --git a/docs/docs.html.in b/docs/docs.html.in
index 40e0e3b82e..4c46b74980 100644
--- a/docs/docs.html.in
+++ b/docs/docs.html.in
@@ -120,6 +120,11 @@

         <dt><a href="secureusage.html">Secure usage</a></dt>
         <dd>Secure usage of the libvirt APIs</dd>
+
+        <dt><a href="domainstatecapture.html">Domain state
+            capture</a></dt>
+        <dd>Comparison between different methods of capturing domain
+          state</dd>
       </dl>
     </div>

diff --git a/docs/domainstatecapture.html.in b/docs/domainstatecapture.html.in
new file mode 100644
index 0000000000..00ab7e8ee1
--- /dev/null
+++ b/docs/domainstatecapture.html.in
@@ -0,0 +1,190 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE html>
+<html xmlns="http://www.w3.org/1999/xhtml">
+  <body>
+
+    <h1>Domain state capture using libvirt</h1>
+
+    <ul id="toc"></ul>
+
+    <p>
+      This page compares the different means for capturing state
+      related to a domain managed by libvirt, in order to help
+      application developers choose which operations best suit
+      their needs.
+    </p>
+
+    <h2><a id="definitions">State capture trade-offs</a></h2>
+
+    <p>One of the features made possible with virtual machines is live
+      migration, or transferring all state related to the guest from
+      one host to another, with minimal interruption to the guest's
+      activity.  A clever observer will then note that if all state is
+      available for live migration, there is nothing stopping a user
+      from saving that state at a given point of time, to be able to
+      later rewind guest execution back to the state it previously
+      had.  There are several different libvirt APIs associated with
+      capturing the state of a guest, such that the captured state can
+      later be used to rewind that guest to the conditions it was in
+      earlier.  But since there are multiple APIs, it is best to
+      understand the tradeoffs and differences between them, in order
+      to choose the best API for a given task.
+    </p>
+
+    <dl>
+      <dt>Timing</dt>
+      <dd>Capturing state can be a lengthy process, so while the
+        captured state ideally represents an atomic point in time
+        corresponding to something the guest was actually executing,
+        some interfaces require up-front preparation (the state
+        captured is not complete until the API ends, which may be some
+        time after the command was first started), while other
+        interfaces track the state when the command was first issued
+        even if it takes some time to finish capturing the state.
+        While it is possible to freeze guest I/O around either point
+        in time (so that the captured state is fully consistent,
+        rather than just crash-consistent), knowing whether the state
+        is captured at the start or end of the command may determine
+        which approach to use.  A related concept is the amount of
+        downtime the guest will experience during the capture,
+        particularly since freezing guest I/O has time
+        constraints.</dd>
+
+      <dt>Amount of state</dt>
+      <dd>For an offline guest, only the contents of the guest disks
+        need to be captured; restoring that state is merely a fresh
+        boot with the disks restored to that state.  But for an online
+        guest, there is a choice between storing the guest's memory
+        (all that is needed during live migration where the storage is
+        shared between source and destination), the guest's disk state
+        (all that is needed if there are no pending guest I/O
+        transactions that would be lost without the corresponding
+        memory state), or both together.  Unless guest I/O is quiesced
+        prior to capturing state, reverting to captured disk
+        state of a live guest without the corresponding memory state
+        is comparable to booting a machine that previously lost power
+        without a clean shutdown; but for a guest that uses
+        appropriate journaling methods, this crash-consistent state
+        may be sufficient to avoid the additional storage and time
+        needed to capture memory state.</dd>
+
+      <dt>Quantity of files</dt>
+      <dd>When capturing state, some approaches store all state within
+        the same file (internal), while others expand a chain of
+        related files that must be used together (external), leaving
+        more files for a management application to track.  There are
+        also differences depending on whether the state is captured in
+        the same file in use by a running guest, or whether the state
+        is captured to a distinct file without impacting the files
+        used to run the guest.</dd>
+
+      <dt>Third-party integration</dt>
+      <dd>When capturing state, particularly for a running guest, there
+        are tradeoffs in how much of the process must be done directly
+        by the hypervisor, and how much can be off-loaded to third-party
+        software.  Since capturing state is not instantaneous, it is
+        essential that any third-party integration see consistent data
+        even if the running guest continues to modify that data after
+        the point in time of the capture.</dd>
+
+      <dt>Full vs. partial</dt>
+      <dd>When capturing state, it is useful to minimize the amount of
+        state that must be captured in relation to a previous capture,
+        by focusing only on the portions of the disk that the guest
+        has modified since the previous capture.  Some approaches are
+        able to take advantage of checkpoints to provide an
+        incremental backup, while others are only capable of a full
+        backup including portions of the disk that have not changed
+        since the previous state capture.</dd>
+    </dl>
+
+    <h2><a id="apis">State capture APIs</a></h2>
+    <p>With those definitions, the following libvirt APIs have these
+      properties:</p>
+    <dl>
+      <dt>virDomainSnapshotCreateXML()</dt>
+      <dd>This API wraps several approaches for capturing guest state,
+        with a general premise of creating a snapshot (where the
+        current guest resources are frozen in time and a new wrapper
+        layer is opened for tracking subsequent guest changes).  It
+        can operate on both offline and running guests, can choose
+        whether to capture the state of memory, disk, or both when
+        used on a running guest, and can choose between internal and
+        external storage for captured state.  However, it is geared
+        towards post-event captures (when capturing both memory and
+        disk state, the disk state is not captured until all memory
+        state has been collected first).  For qemu as the hypervisor,
+        internal snapshots currently have lengthy downtime that is
+        incompatible with freezing guest I/O, but external snapshots
+        are quick.  Since creating an external snapshot changes which
+        disk image resource is in use by the guest, this API can be
+        coupled with <code>virDomainBlockCommit()</code> to restore
+        things back to the guest using its original disk image, where
+        a third-party tool can read the backing file prior to the live
+        commit.  See also the <a href="formatsnapshot.html">XML
+        details</a> used with this command.</dd>
+      <dt>virDomainBlockCopy()</dt>
+      <dd>This API wraps approaches for capturing the state of disks
+        of a running guest, but does not track accompanying guest
+        memory state, and can only operate on one block device per job
+        (to get a consistent copy of multiple disks, the domain must
+        be paused before ending the multiple jobs).  The capture is
+        consistent only at the end of the operation, with a choice to
+        either pivot to the new file that contains the copy (leaving
+        the old file as the backup), or to return to the original file
+        (leaving the new file as the backup).</dd>
+      <dt>virDomainBackupBegin()</dt>
+      <dd>This API wraps approaches for capturing the state of disks
+        of a running guest, but does not track accompanying guest
+        memory state.  The capture is consistent to the start of the
+        operation, where the captured state is stored independently
+        from the disk image in use with the guest, and where it can be
+        easily integrated with a third-party tool for capturing the disk
+        state.  Since the backup operation is stored externally from
+        the guest resources, there is no need to commit data back in
+        at the completion of the operation.  When coupled with
+        checkpoints, this can be used to capture incremental backups
+        instead of full.</dd>
+      <dt>virDomainCheckpointCreateXML()</dt>
+      <dd>This API does not actually capture guest state, so much as
+        make it possible to track which portions of guest disks have
+        changed between checkpoints or between a current checkpoint and
+        the live execution of the guest.  When performing incremental
+        backups, it is easier to create a new checkpoint at the same
+        time as a new backup, so that the next incremental backup can
+        refer to the incremental state since the checkpoint created
+        during the current backup.  Guest state is then actually
+        captured using <code>virDomainBackupBegin()</code>.  <!--See also
+        the <a href="formatcheckpoint.html">XML details</a> used with
+        this command.--></dd>
+    </dl>
+
+    <h2><a id="examples">Examples</a></h2>
+    <p>The following two sequences both capture the disk state of a
+      running guest, then complete with the guest running on its
+      original disk image; but with a difference that an unexpected
+      interruption during the first mode leaves a temporary wrapper
+      file that must be accounted for, while interruption of the
+      second mode has no impact on the guest.</p>
+    <p>1. Backup via temporary snapshot</p>
+    <pre>
+virDomainFSFreeze()
+virDomainSnapshotCreateXML(VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY)
+virDomainFSThaw()
+third-party copy the backing file to backup storage # most time spent here
+virDomainBlockCommit(VIR_DOMAIN_BLOCK_COMMIT_ACTIVE) per disk
+wait for commit ready event per disk
+virDomainBlockJobAbort(VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT) per disk
+    </pre>
+
+    <p>2. Direct backup</p>
+    <pre>
+virDomainFSFreeze()
+virDomainBackupBegin()
+virDomainFSThaw()
+wait for push mode event, or pull data over NBD # most time spent here
+virDomainBackupEnd()
+    </pre>
+
+  </body>
+</html>
diff --git a/docs/formatsnapshot.html.in b/docs/formatsnapshot.html.in
index f2e51df5ab..d7051683a5 100644
--- a/docs/formatsnapshot.html.in
+++ b/docs/formatsnapshot.html.in
@@ -9,6 +9,8 @@
     <h2><a id="SnapshotAttributes">Snapshot XML</a></h2>

     <p>
+      Snapshots are one form
+      of <a href="domainstatecapture.html">domain state capture</a>.
       There are several types of snapshots:
     </p>
     <dl>
-- 
2.14.4



