[libvirt] [PATCH 1/6] Add new API virDomainStreamDisk[Info] to header and drivers

Daniel P. Berrange berrange at redhat.com
Mon May 9 16:09:14 UTC 2011


On Mon, May 02, 2011 at 04:29:49PM -0500, Adam Litke wrote:
> After several long distractions, I am back to working on disk streaming.
> Before I hit the list with a new series of patches, I was hoping to
> reach some middle ground on the proposed streaming API.
> 
> On Fri, 2011-04-08 at 14:31 +0100, Daniel P. Berrange wrote:
> > On Thu, Apr 07, 2011 at 04:31:59PM -0500, Adam Litke wrote:
> > >  /*
> > > + * Disk Streaming
> > > + */
> > > +typedef enum {
> > > +    VIR_STREAM_DISK_ONE   = 1,  /* Stream a single disk unit */
> > > +    VIR_STREAM_DISK_START = 2,  /* Stream the entire disk */
> > > +    VIR_STREAM_DISK_STOP  = 4,  /* Stop streaming a disk */
> > > +} virDomainStreamDiskFlags;
> > 
> > Using flags to combine two separate tasks into one single API
> > is rather unpleasant. As raised in the previous patch, the API
> > should also be taking a offset+length in bytes, then there is
> > no need for a special case transfer of an individual sector.
> 
> This is a fair point.  I will work with Stefan to support an
> offset/length qemu API.  Since there doesn't seem to be a great way to
> query device sizes, I think we do need a convenient way to request that
> the entire disk be streamed.  We could either do this with a flag or by
> overriding (offset==0 && length==0) to mean stream the entire device.
> Do you have a preference?

Since length==0 is otherwise meaningless, it is fine to  use that
to indicate "until end of disk". This is consistent with
what we do for virStorageVolUpload/Download which allow length=0
to indicate "until end of disk".

> > > +#define VIR_STREAM_PATH_BUFLEN 1024
> > > +#define VIR_STREAM_DISK_MAX_STREAMS 10
> > > +
> > > +typedef struct _virStreamDiskState virStreamDiskState;
> > > +struct _virStreamDiskState {
> > > +    char path[VIR_STREAM_PATH_BUFLEN];
> > > +    /*
> > > +     * The unit of measure for size and offset is unspecified.  These fields
> > > +     * are meant to indicate the progress of a continuous streaming operation.
> > > +     */
> > > +    unsigned long long offset; /* Current offset of active streaming */
> > > +    unsigned long long size;   /* Disk size */
> > > +};
> > > +typedef virStreamDiskState *virStreamDiskStatePtr;
> > > +
> > > +unsigned long long       virDomainStreamDisk(virDomainPtr dom,
> > > +                                             const char *path,
> > > +                                             unsigned long long offset,
> > > +                                             unsigned int flags);
> > > +
> > > +int                      virDomainStreamDiskInfo(virDomainPtr dom,
> > > +                                                 virStreamDiskStatePtr states,
> > > +                                                 unsigned int nr_states,
> > > +                                                 unsigned int flags);
> > 
> > I would have liked it if we could use the existing JobInfo APIs for
> > getting progress information, but if we need to allow concurrent
> > usage for multiple disks per-VM, we can't. I think we should still
> > use a similar style of API though.
> 
> The goal is to eventually support concurrent streams.  Therefore, we
> will probably need to roll our own Status and Abort APIs (see my
> proposed API below).
> 
> > There also doesn't appear to be a precise way to determine if the
> > copying of an entire disk failed part way through, and if so, how
> > much was actually copied.
> > 
> > Taking all the previous points together, I think the API needs to
> > look like this:
> > 
> >     typedef enum {
> >         /* If set, virDomainBlockAllocate() will return immediately
> >          * allowing polling for operation completion & status
> >          */
> >         VIR_DOMAIN_DISK_ALLOCATE_NONBLOCK,
> >     } virDomainBlockAllocateFlags;
> > 
> >     /*
> >      * @path: fully qualified filename of the virtual disk
> 
> I probably misnamed it, but path is actually the device alias, not a
> path to an image file.

Hmm, I wonder whether that is a good choice or not. The other
APIs all use the disk path. Perhaps we could use that as default
and have a flag to indicate whether the path should be treated as
a device alias instead. Thus getting both options.

> 
> >      * @offset: logical position in bytes, within the virtual disk
> >      * @length: amount of data to copy, in bytes starting from @offset
> >      * @flags: One of virDomainBlockAllocateFlags, or zero
> >      *
> >      * Ensure the virtual disk @path is fully allocated
> >      * between @offset and @offset+ at length. If a backing
> >      * store is present, data will be filled from the
> >      * backing store, otherwise data will be fileld with
> >      * zeros
> >      *
> >      * If @flags contains VIR_DOMAIN_DISK_ALLOCATE_NONBLOCK,
> >      * this API will return immediately after initiating the
> >      * copy, otherwise it will block until copying is complete
> >      *
> >      */
> >     int virDomainBlockAllocate(virDomainPtr dom,
> >                                const char *path,
> >                                unsigned long long offset,
> >                                unsigned long long length,
> >                                unsigned int flags);
> > 
> > 
> >     /*
> >      * @path: fully qualified filename of the virtual disk
> >      * @info: allocated struct to return progress info
> >      *
> >      * Query the progress of a disk allocation job. This
> >      * API must be used when virDomainBlockAllocate() was
> >      * invoked with the VIR_DOMAIN_DISK_ALLOCATE_NONBLOCK
> >      * flag set.
> >      *
> >      * The @info.type field will indicate whether the job
> >      * was completed successfully, or failed part way
> >      * through.
> >      *
> >      * The @info data progress fields will contain current
> >      * progress information.
> >      *
> >      * The hypervisor driver may optionally chose to also
> >      * fillin a time estimate for completion.
> >      */
> >     int virDomainBlockGetJobInfo(virDomainPtr dom,
> >                                  const char *path,
> >                                  virDomainJobInfoPtr info);
> > 
> >     /*
> >      * @path: fully qualified filename of the virtual disk
> >      *
> >      * Request that a disk allocation job be aborted at
> >      * the soonest opportunity. This API can only be used
> >      * when virDomainBlockAllocate() was invoked with the
> >      * VIR_DOMAIN_DISK_ALLOCATE_NONBLOCK flag set.
> >      */
> >     int virDomainBlockAbortJob(virDomainPtr dom,
> >                                const char *path);
> > 
> > 
> >     typedef struct _virDomainBlockRegion virDomainBlockRegion;
> >     typedef virDomainBlockRegion *virDomainBlockRegionPtr;
> >     struct _virDomainBlockRegion {
> >         /* The logical offset within the file of the allocated region */
> >         unsigned long long offset;
> >         /* The length of the allocated region */
> >         unsigned long long length;
> >     };
> > 
> >     /*
> >      * @path: fully qualified filename of the virtual disk
> >      * @nregions: filled in the number of @region structs
> >      * @regions: filled with a list of allocated regions
> >      *
> >      * Query the extents of allocated regions within the
> >      * virtual disk file. The offsets in the list of regions
> >      * are not guarenteed to be sorted in any explicit order.
> >      */
> >     int virDomainBlockGetAllocationMap(virDomainPtr dom,
> >                                        const char *path,
> >                                        unsigned int *nregions,
> >                                        virDomainBlockRegionPtr *regions);
> 
> I am not convinced that we really need the block allocation map stuff.
> It's a fun toy, but the layout of data within a device is far too
> low-level of a concept to be exposing to users.  In my opinion,
> streaming should really be done sequentially from the start of the
> device to the end (with arbitrary chunk sizes allowed for throttling
> purposes).  If an application wants to stream according to some special
> pattern, let them maintain a data structure outside of libvirt to manage
> that extra complexity.  It's not an error to stream a chunk that is
> already present.  The populated areas will just complete immediately.
> Therefore, there isn't a need to maintain a strict record of outstanding
> regions.

Well we can always add something like this in at a later
date, so it is fine to drop it.

> > This takes care of things for running guests. It would be
> > desirable to have the same functionality available when a
> > guest is not running, via the virStorageVol APIs. Indeed,
> > this would allow access to the allocation functionality
> > for disks not explicitly associated with any VM yet.
> 
> In light of previous discussion about the complexities around storage
> formats and filesystems, I am not sure how useful such an offline API is
> going to be in practice.  At any rate, I would prefer to consider that
> issue separately.
> 
> So, with my points taken into account I would like to counter with the
> following API proposal:
> 
> ==== snip ====
> 
>     typedef enum {
>          /* If set, virDomainBlockAllocate() will return immediately
>           * allowing polling for operation completion & status
>           */
>          VIR_DOMAIN_DISK_STREAM_NONBLOCK,
>      } virDomainBlockStreamFlags;
> 
>      /*
>       * @device: alias of the target block device
>       * @offset: logical position in bytes, within the virtual disk
>       * @length: amount of data to copy, in bytes starting from @offset
>       * @flags: One of virDomainBlockAllocateFlags, or zero
>       *
>       * Ensure the virtual disk @device is fully allocated
>       * between @offset and @offset+ at length. If a backing
>       * store is present, data will be filled from the
>       * backing store, otherwise data will be fileld with
>       * zeros.
>       *
>       * If @flags contains VIR_DOMAIN_DISK_STREAM_NONBLOCK,
>       * this API will return immediately after initiating the
>       * copy, otherwise it will block until copying is complete
>       *
>       */
>      int virDomainBlockStream(virDomainPtr dom,
>                               const char *device,
>                               unsigned long long offset,
>                               unsigned long long length,
>                               unsigned int flags);
> 
>      /*
>       * @device: alias of the target block device
>       *
>       * Request that a disk streaming job be aborted at
>       * the soonest opportunity. This API can only be used
>       * when virDomainBlockStream() was invoked with the
>       * VIR_DOMAIN_DISK_STREAM_NONBLOCK flag set.
>       */
>      int virDomainBlockStreamAbort(virDomainPtr dom,
>                                    const char *device);
> 
>      typedef enum {
>          VIR_BLOCK_STREAM_STATE_COMPLETED = 0,
>          VIR_BLOCK_STREAM_STATE_ACTIVE = 1,
>          VIR_BLOCK_STREAM_STATE_FAILED = 2,
>      } virBlockStreamStatus;
> 
>      typedef struct _virBlockStreamInfo virBlockStreamInfo;
>      struct _virBlockStreamInfo {
>          virBlockStreamStatus status;

Coding guidelines don't allow use of enum types in structs
or as API parameters, instead requiring 'int'.

>          unsigned long long remaining; /* number of bytes remaining */ 
>      };
>      typedef virBlockStreamInfo *virBlockStreamInfoPtr;

s/virBlock/virDomainBlock/ in several places here.

> 
>      /*
>       * @device: alias of the target block device
>       * @info: allocated struct to return progress info
>       *
>       * Query the progress of a disk streaming job. This
>       * API must be used when virDomainBlockStream() was
>       * invoked with the VIR_DOMAIN_DISK_STREAM_NONBLOCK
>       * flag set.
>       *
>       */
>      int virDomainBlockStreamInfo(virDomainPtr dom,
>                                   const char *device,
>                                   virBlockStreamInfoPtr info);

Any reason you didn't just use the existing 'virDomainJobInfoPtr'
struct, with this new API ? In particular I think we want many
of the other fields from that struct beyond just 'remaining'.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|




More information about the libvir-list mailing list