[libvirt] RFC: API additions for enhanced snapshot support

Tue Jul 5 21:36:39 UTC 2011

On 07/04/2011 07:53 PM, Jagane Sundar wrote:
> Thanks for looping me in, Stefan.
> 
>> Does each volume have its own independent snapshot namespace?  It may
>> be wise to document that snapshot namespaces are *not* independent
>> because storage backends may not be able to provide these semantics.
>>
> There is a need for 'just-a-volume-snapshot', and for a
> 'whole-vm-snapshot'.
> The 'whole-vm-snapshot' can possibly be a collection of
> 'just-a-volume-snapshot'.

In the case of the current libvirt API, which makes use of the qemu
'savevm' monitor command (and thus is more of a checkpoint, rather than
just disk snapshots):

'savevm' fails unless all disks associated with the guest are already
qcow2 format.  Additionally, it creates a snapshot visible to 'qemu-img
snapshot -l' in all associated disks, but only the primary disk's image
also holds the state of RAM.

As for the commands that create a snapshot - the proposal for
virStorageVolSnapshotCreateXML is _solely_ for offline management of
storage volumes.  If a storage volume is in use by a running qemu
domain, then the only appropriate way to take an online snapshot of that
disk (short of stopping the domain to get to the offline snapshot case)
is to use the existing virDomainSnapshotCreateXML API instead.  And that
API is already flexible enough to support 'whole-vm-snapshot' vs.
'just-a-volume-snapshot'.

The existing virDomainSnapshotCreateXML API is currently mapped to the
'savevm' command (which takes a checkpoint, which is the
whole-vm-snapshot + memory), but can easily be modified to take just
disk snapshots, and I already mentioned doing that by modifying the XML
and adding a flag.  That is, on creation:

virDomainSnapshotCreateXML(domain, "
<domainsnapshot>
  <name>whatever</name>
</domainsnapshot>
", 0)

is the existing usage, which creates a checkpoint (all qcow2 images get
an internal snapshot named "whatever", and the first image also saves
the memory state).

virDomainSnapshotCreateXML(domain, "
<domainsnapshot>
  <name>whatever</name>
</domainsnapshot>
", VIR_DOMAIN_SNAPSHOT_DISK_ONLY)

would try to create a snapshot of all disks associated with the domain
(although without the name of the snapshot file, either libvirt will
have to have some default smarts for how to generate a reasonable backup
file name, or this will fail).  That is, omit all mention of <disk>
subelements, and libvirt will then fill out the XML to cover all disks
(the whole-vm-snapshot case).

virDomainSnapshotCreateXML(domain, "
<domainsnapshot>
  <name>whatever</name>
  <disk name='/path/to/image1' snapshot='no'/>
  <disk name='/path/to/image2'>
    <volsnapshot>...</volsnapshot>
  </disk>
</domainsnapshot>
", VIR_DOMAIN_SNAPSHOT_DISK_ONLY)

will only do a snapshot of disk image 2, using the <volsnapshot>
information to explicitly specify the filename to use on the created
external snapshot file (rather than letting libvirt generate the
snapshot name).  That is, provide at least one <disk> element, and you
now have fine-grained control over which volumes get a snapshot (or even
how that snapshot is created).
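
To make the three request shapes above concrete, here is a minimal
sketch in C of how a client might assemble the XML before handing it to
virDomainSnapshotCreateXML.  The struct disk_request type and the
build_snapshot_xml helper are hypothetical names invented for this
illustration, not libvirt API.  The sketch deliberately leaves out the
<volsnapshot> sub-element, since its contents are not spelled out here,
and it rejects rather than escapes characters that would need XML
escaping.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-disk request; not a libvirt type. */
struct disk_request {
    const char *path;   /* disk image path, used as the <disk> name */
    int take_snapshot;  /* 0 emits snapshot='no' for this disk */
};

/*
 * Write a <domainsnapshot> document into buf.  With ndisks == 0 no <disk>
 * elements are emitted, which leaves libvirt to cover all disks (the
 * whole-vm-snapshot case).  Returns the number of characters written, or
 * -1 if buf is too small or an input would need XML escaping.
 */
int build_snapshot_xml(char *buf, size_t size, const char *name,
                       const struct disk_request *disks, size_t ndisks)
{
    size_t used;
    int n;

    assert(buf != NULL && name != NULL);
    assert(ndisks == 0 || disks != NULL);

    /* This sketch does no escaping, so refuse awkward input instead. */
    if (strpbrk(name, "<&") != NULL)
        return -1;

    n = snprintf(buf, size, "<domainsnapshot>\n  <name>%s</name>\n", name);
    if (n < 0 || (size_t)n >= size)
        return -1;
    used = (size_t)n;

    for (size_t i = 0; i < ndisks; i++) {
        const char *attr = disks[i].take_snapshot ? "" : " snapshot='no'";

        if (strpbrk(disks[i].path, "<&'") != NULL)
            return -1;
        n = snprintf(buf + used, size - used, "  <disk name='%s'%s/>\n",
                     disks[i].path, attr);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(buf + used, size - used, "</domainsnapshot>\n");
    if (n < 0 || (size_t)n >= size - used)
        return -1;
    return (int)(used + (size_t)n);
}
```

The resulting buffer would be passed as the XML argument, with flags of
0 for a checkpoint or the proposed disk-only flag for the other two
cases.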

> There are two types of snapshots that I am aware of:
> - Base file is left unmodified after snapshot, snapshot file is created
> and modified. e.g. qcow2 (I think)
> - Base file continues to be modified. The snapshot file gets COW blocks
> copied into it. e.g. LVM, Livebackup, etc.

There's a third - the qcow2 internal snapshot:

- Base file contains both the snapshot and the delta.

> 
> Can we enhance the libvirt API to indicate what type of snapshot is
> desired. Also, when a snapshot is listed, can we try and describe it as
> one kind or the other?

Yes, there are already some read-only XML elements in the
<domainsnapshot> XML (that is, libvirt ignores or rejects them if you
pass them to virDomainSnapshotCreateXML, but virDomainSnapshotGetXMLDesc
will list those additional elements to give you more details about the
snapshot).  Adding a sub-element there could state whether the snapshot
is backing-file based (original is now treated as read-only, and
modifications affect the snapshot) or COW based (original and backup
share all blocks to begin with, but as the original gets modified, the
read-only backup has more unique blocks).
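
As a rough illustration of the distinction, the three kinds of snapshot
discussed in this thread could be modeled as below.  Every name here
(enum snapshot_kind, the string labels, the helper functions) is
hypothetical and chosen only for this sketch; none of it is existing
libvirt API or an agreed XML vocabulary.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification; these names are not part of libvirt. */
enum snapshot_kind {
    SNAPSHOT_BACKING_FILE, /* original becomes read-only backing file */
    SNAPSHOT_COW,          /* original keeps changing; old blocks copied out */
    SNAPSHOT_INTERNAL      /* qcow2 internal: one file holds snapshot + delta */
};

/* Which file receives guest writes once the snapshot exists. */
enum write_target { WRITES_TO_ORIGINAL, WRITES_TO_SNAPSHOT_FILE };

enum write_target snapshot_write_target(enum snapshot_kind kind)
{
    switch (kind) {
    case SNAPSHOT_BACKING_FILE:
        return WRITES_TO_SNAPSHOT_FILE;
    case SNAPSHOT_COW:
    case SNAPSHOT_INTERNAL:
        return WRITES_TO_ORIGINAL;
    }
    return WRITES_TO_ORIGINAL; /* not reached for valid kinds */
}

/* Label that a read-only sub-element might carry (assumed spelling). */
const char *snapshot_kind_name(enum snapshot_kind kind)
{
    switch (kind) {
    case SNAPSHOT_BACKING_FILE: return "backing-file";
    case SNAPSHOT_COW:          return "cow";
    case SNAPSHOT_INTERNAL:     return "internal";
    }
    return "unknown";
}

/* Parse a label back into a kind.  Returns 0 on success, -1 otherwise. */
int snapshot_kind_parse(const char *text, enum snapshot_kind *out)
{
    static const enum snapshot_kind kinds[] = {
        SNAPSHOT_BACKING_FILE, SNAPSHOT_COW, SNAPSHOT_INTERNAL
    };

    assert(text != NULL && out != NULL);
    for (size_t i = 0; i < sizeof kinds / sizeof kinds[0]; i++) {
        if (strcmp(text, snapshot_kind_name(kinds[i])) == 0) {
            *out = kinds[i];
            return 0;
        }
    }
    return -1;
}
```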

> 
> There is no facility in the API to track dirty bitmaps. Suppose a disk
> format or qemu proper has the ability to maintain a dirty bitmap of
> blocks (or clusters) modified since some event (time in ms, perhaps). I
> would like libvirt to provide a function such as:
> 
> /*
> * returns NULL if the underlying block driver does not support
> * maintaining a dirty bitmap. If it does support a dirty bitmap,
> * the driver returns an opaque object that represents the time
> * since which this dirty bitmap is valid.
> *
> * Used by incremental backup programs to determine if qemu
> * has a bitmap of blocks that were dirtied since the last time
> * a backup was taken.
> */
> virStorageDirtyBitmapTimeOpaquePtr
> virStorageVolDirtyBitmapPresent(virStorageVolPtr vol)

Yes, we already had a discussion about the utility of being able to
expose how much of an image is directly contained within a file, vs.
being pulled in from a backing file (which can also be read as how much
of an image is dirty compared to the state of a snapshot).  See Daniel's
earlier thoughts:
https://www.redhat.com/archives/libvir-list/2011-April/msg00555.html
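
For readers unfamiliar with the idea, here is a minimal sketch in C of
what a per-cluster dirty bitmap amounts to: one bit per cluster, set
whenever a write touches that cluster, and cleared when a backup
completes.  The struct dirty_bitmap type and its functions are
hypothetical and are not how qemu or any libvirt driver tracks this; the
sketch also assumes offset + length does not overflow.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure; not a qemu or libvirt type. */
struct dirty_bitmap {
    uint64_t cluster_size; /* bytes covered by each bit */
    size_t nclusters;      /* number of clusters in the image */
    unsigned char *bits;   /* one bit per cluster, 1 = modified */
};

static size_t bitmap_bytes(const struct dirty_bitmap *bm)
{
    size_t n = (bm->nclusters + 7) / 8;
    return n ? n : 1; /* always allocate at least one byte */
}

/* Returns 0 on success, -1 on allocation failure. */
int dirty_bitmap_init(struct dirty_bitmap *bm, uint64_t image_size,
                      uint64_t cluster_size)
{
    assert(bm != NULL && cluster_size > 0);
    bm->cluster_size = cluster_size;
    bm->nclusters = (size_t)((image_size + cluster_size - 1) / cluster_size);
    bm->bits = calloc(bitmap_bytes(bm), 1);
    return bm->bits ? 0 : -1;
}

/* Record a write of length bytes at offset; out-of-range parts are ignored. */
void dirty_bitmap_mark(struct dirty_bitmap *bm, uint64_t offset,
                       uint64_t length)
{
    uint64_t first, last;

    if (length == 0 || bm->nclusters == 0)
        return;
    first = offset / bm->cluster_size;
    last = (offset + length - 1) / bm->cluster_size;
    if (first >= bm->nclusters)
        return;
    if (last >= bm->nclusters)
        last = bm->nclusters - 1;
    for (uint64_t c = first; c <= last; c++)
        bm->bits[c / 8] |= (unsigned char)(1u << (c % 8));
}

int dirty_bitmap_is_dirty(const struct dirty_bitmap *bm, size_t cluster)
{
    assert(cluster < bm->nclusters);
    return (bm->bits[cluster / 8] >> (cluster % 8)) & 1;
}

/* Number of clusters an incremental backup would need to copy. */
size_t dirty_bitmap_count(const struct dirty_bitmap *bm)
{
    size_t count = 0;

    for (size_t c = 0; c < bm->nclusters; c++)
        count += (size_t)dirty_bitmap_is_dirty(bm, c);
    return count;
}

/* Start a new tracking period, e.g. after a backup has completed. */
void dirty_bitmap_reset(struct dirty_bitmap *bm)
{
    memset(bm->bits, 0, bitmap_bytes(bm));
}

void dirty_bitmap_free(struct dirty_bitmap *bm)
{
    free(bm->bits);
    bm->bits = NULL;
}
```

An incremental backup program would copy only the clusters whose bit is
set and then reset the bitmap, which is why it needs to know the point
in time from which the bitmap is valid.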

-- 
Eric Blake   eblake at redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org
