[libvirt] [RFC] Proposed API to support block device streaming

Adam Litke agl at us.ibm.com
Tue Nov 9 21:17:23 UTC 2010


I've been working with Anthony Liguori and Stefan Hajnoczi to enable data
streaming to copy-on-read disk images in qemu.  This work is making its way
through peer review, and I expect it to be upstream soon as part of the support
for the new QED disk image format.

I would like to enable these commands in libvirt in order to support at least
two compelling use cases:

1) Rapid deployment of domains:
Creating a new domain from a central repository of images can be time-consuming
since a local copy of the image must be made before the domain can be started.
With copy-on-read and streaming, up-front copy time is eliminated and the
domain can be started immediately.  Streaming can run while the domain runs
to fully populate the disk image.

2) Post-copy live block migration:
A qemu-nbd server is started on the source host and serves the domain's block
device to the destination host.  A QED image is created on the destination host
with backing to the nbd server.  The domain is migrated as normal.  When
migration completes, a stream command is executed to fully populate the
destination QED image.  After streaming completes, the qemu-nbd server can
be shut down and the domain (including local storage) is fully independent of
the source host.

Qemu will support two streaming modes: full device and single sector.  Full
device streaming is the easiest to use because one command will cause the whole
device to be streamed as fast as possible.  Single sector mode can be used if
one wants to throttle streaming to reduce I/O pressure.  In this mode, the user
issues individual commands to stream single sectors.

To enable this support in libvirt, I propose the following API...

virDomainStreamDisk() initiates either a full device stream or a single sector
stream (depending on virDomainStreamDiskFlags).  For a full device stream, it
returns 0 on success or -1 on failure.  For a single sector stream, it returns
an offset that can be used to continue streaming with a subsequent call to
virDomainStreamDisk().
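
As a rough sketch of how a caller might throttle streaming in single sector
mode, the loop below feeds each returned offset back into the next call.  It
runs against a mock rather than libvirt: mock_stream_disk(), struct mock_disk,
STREAM_ERR and stream_throttled() are hypothetical names used only for
illustration.  It also assumes that errors come back as -1 converted to the
unsigned return type, and that a returned offset equal to the disk size means
streaming is complete; neither is settled by this proposal.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values copied from the proposed patch. */
enum {
    VIR_STREAM_DISK_FULL = 1, /* Stream the entire disk */
    VIR_STREAM_DISK_ONE  = 2, /* Stream a single disk unit */
};

/* Assumed error sentinel: -1 converted to the unsigned return type. */
#define STREAM_ERR ((unsigned long long)-1)

/* Hypothetical stand-in for a domain's disk; not part of the proposal. */
struct mock_disk {
    unsigned long long size;     /* total units on the disk */
    unsigned long long streamed; /* units populated so far */
};

/*
 * Hypothetical stand-in for virDomainStreamDisk() in VIR_STREAM_DISK_ONE
 * mode: populate the unit at 'offset' and return the next offset.
 * Assumption: a return value equal to the disk size means "done".
 */
static unsigned long long
mock_stream_disk(struct mock_disk *disk, unsigned long long offset,
                 unsigned int flags)
{
    assert(disk != NULL);
    if (flags != VIR_STREAM_DISK_ONE || offset >= disk->size)
        return STREAM_ERR;
    disk->streamed++;
    return offset + 1;
}

/*
 * Stream the whole disk one unit at a time, calling 'throttle' (if given)
 * between units so the caller controls the I/O rate.
 * Returns 0 on success, -1 on error.
 */
static int
stream_throttled(struct mock_disk *disk, void (*throttle)(void))
{
    unsigned long long offset = 0;

    while (offset < disk->size) {
        offset = mock_stream_disk(disk, offset, VIR_STREAM_DISK_ONE);
        if (offset == STREAM_ERR)
            return -1;
        if (throttle)
            throttle();
    }
    return 0;
}
```

The throttle callback is where a management application would sleep or check
its own I/O budget before requesting the next unit.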

virDomainStreamDiskInfo() returns the status of a currently-running full device
stream (the device name, current streaming position, and total size).
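
To illustrate how the status might be consumed, here is a minimal sketch that
turns offset and size into a percentage.  Because only the ratio is used, the
unspecified unit of measure does not matter.  The struct layout is copied from
the patch; mock_stream_disk_info(), stream_percent(), print_stream_progress()
and the image paths are hypothetical, and the sketch assumes the info call
returns the number of entries it filled in (or -1 on error), which this
proposal does not yet specify.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define VIR_STREAM_PATH_BUFLEN 100
#define VIR_STREAM_DISK_MAX_STREAMS 10

/* Struct layout copied from the proposed patch. */
typedef struct _virStreamDiskState virStreamDiskState;
struct _virStreamDiskState {
    char path[VIR_STREAM_PATH_BUFLEN];
    unsigned long long offset; /* Current offset of active streaming */
    unsigned long long size;   /* Disk size */
};

/*
 * Hypothetical stand-in for virDomainStreamDiskInfo(): copies up to
 * 'nr_infos' canned entries into 'infos' and returns how many were
 * written, or -1 on error (assumed return convention).
 */
static int
mock_stream_disk_info(virStreamDiskState *infos, unsigned int nr_infos)
{
    /* Made-up example streams; the paths are illustrative only. */
    static const virStreamDiskState active[] = {
        { "/var/lib/libvirt/images/a.qed", 256, 1024 },
        { "/var/lib/libvirt/images/b.qed", 512, 512 },
    };
    unsigned int n = sizeof(active) / sizeof(active[0]);

    if (infos == NULL)
        return -1;
    if (n > nr_infos)
        n = nr_infos;
    memcpy(infos, active, n * sizeof(active[0]));
    return (int)n;
}

/* Percent complete, rounded down; an empty or finished disk reports 100. */
static unsigned int
stream_percent(const virStreamDiskState *st)
{
    assert(st != NULL);
    if (st->size == 0 || st->offset >= st->size)
        return 100;
    /* Floating point avoids overflow of offset * 100 on very large disks. */
    return (unsigned int)((double)st->offset * 100.0 / (double)st->size);
}

/* Print one progress line per active stream; returns the stream count. */
static int
print_stream_progress(void)
{
    virStreamDiskState infos[VIR_STREAM_DISK_MAX_STREAMS];
    int n = mock_stream_disk_info(infos, VIR_STREAM_DISK_MAX_STREAMS);

    for (int i = 0; i < n; i++)
        printf("%s: %u%%\n", infos[i].path, stream_percent(&infos[i]));
    return n;
}
```

A management application could call something like print_stream_progress()
periodically while a full device stream is running.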

Comments on this design would be greatly appreciated.  Thanks!

diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in
index 81db3a2..d80a8b5 100644
--- a/include/libvirt/libvirt.h.in
+++ b/include/libvirt/libvirt.h.in
@@ -1046,6 +1046,39 @@ int virDomainUpdateDeviceFlags(virDomainPtr domain,
                                const char *xml, unsigned int flags);
 
 /*
+ * Disk Streaming
+ */
+typedef enum {
+    VIR_STREAM_DISK_FULL = 1, /* Stream the entire disk */
+    VIR_STREAM_DISK_ONE  = 2, /* Stream a single disk unit */
+} virDomainStreamDiskFlags;
+
+#define VIR_STREAM_PATH_BUFLEN 100
+#define VIR_STREAM_DISK_MAX_STREAMS 10
+
+typedef struct _virStreamDiskState virStreamDiskState;
+struct _virStreamDiskState {
+    char path[VIR_STREAM_PATH_BUFLEN];
+    /*
+     * The unit of measure for size and offset is unspecified.  These fields
+     * are meant to indicate the progress of a continuous streaming operation.
+     */
+    unsigned long long offset; /* Current offset of active streaming */
+    unsigned long long size;   /* Disk size */
+};
+typedef virStreamDiskState *virStreamDiskStatePtr;
+
+unsigned long long       virDomainStreamDisk(virDomainPtr dom,
+                                             const char *path,
+                                             unsigned long long offset,
+                                             unsigned int flags);
+
+int                      virDomainStreamDiskInfo(virDomainPtr dom,
+                                                 virStreamDiskStatePtr infos,
+                                                 unsigned int nr_infos,
+                                                 unsigned int flags);
+
+/*
  * NUMA support
  */
 

-- 
Thanks,
Adam



