[libvirt] [PATCH 00/27] Introduce sparse streams

Daniel P. Berrange berrange at redhat.com
Thu May 5 16:10:50 UTC 2016

On Thu, Apr 28, 2016 at 12:04:47PM +0200, Michal Privoznik wrote:
> So, after couple of sleepless nights and headaches I'm proud to
> announce that finally got this working.
> What?
> Our regular streams that are can be used to transfer disk images
> for domains are unaware of any sparseness. Therefore they have
> two limitations:
> a) transferring big but sparse image can take ages as all the
> holes (interpreted by kernel as '\0') have to go through our
> event loop.
> b) resulting volume is not sparse even if the source was.
> How?
> I went by verified approach that linux kernel has. One way to
> look at our streams is just like read() and write() with a
> different names: virStreamRecv() and virStreamSend(). They even
> have the same prototype (if 'int fd' is substituted with
> 'virStreamPtr'). Now, holes in files are created and detected via
> third API: lseek(). Therefore I'm introducing new virStreamSkip()
> API that mimics the missing primitive. Now, because our streams
> do not necessarily have to work over files (they are for generic
> data transfer), I had to let users register a callback that is
> called whenever the other side calls virStreamSkip().
> So now that we have all three primitives, we can focus on making
> life easier for our users. Nobody is actually using bare
> virStreamSend() and virStreamRecv() rather than our wrappers:
> virStreamSendAll() and virStreamRecvAll(). With my approach
> described above just virStreamSendAll() needs to be adjusted so
> that it's 'sparse file' aware. The virStreamRecvAll() will only
> get the data to write (just like it is now) with skip callback
> called automatically whenever needed. In order for
> virStreamSendAll() to skip holes I'm introducing yet another
> callback: virStreamInDataFunc(). This callback will help us to
> create a map of a file: before each virStreamSend() it checks
> whether we are in a data section or a hole and calls
> virStreamSend() or virStreamSkip() respectively.
> Do not worry - it will all become clear once you see the code.
> Now question is - how will users enable this feature? I mean, we
> have take into account that we might be talking to an older
> daemon that does not know how to skip a hole. Or vice versa -
> older client.
> The solution I came up with is to introduce flags to APIs where
> sparse streams make sense. I guess it makes sense for volume
> upload and download, but almost certainly makes no sense for
> virDomainOpenConsole().
> Code?
> >From users POV they just need to pass correct argument to
> 'vol-upload' or 'vol-download' virsh commands. One layer down, on
> programming level they need to merely:
> st = virStreamNew(conn, 0);
> virStreamRegisterSkip(st, skipFunc, &fd);
> virStorageVolDownload(st, ...);
> virStreamRecvAll(st, sinkFunc, &fd);
> where:
> int skipFunc(virStreamPtr st,
>              unsigned long long offset, void *opaque)
> {
>     int *fd = opaque;
>     return lseek(*fd, offset, SEEK_CUR) == (off_t) -1 ? -1 : 0;
> }
> And for uploading it's slightly more verbose - see patch 24.

While it looks ok-ish in the context of the RecvAll function,
because you have skipFunc + recvFunc both being invoked
asynchronously, this design feels a quite odd when used
in combination with the plain virStreamRecv().

Just for a minute lets consider

 st = virStreamNew(conn, 0);
 virStreamRegisterSkip(st, skipFunc, &fd);
 virStorageVolDownload(st, ...);

 while (1) {
    char buf[4096];
    virStreamRecv(st, buf, sizeof(buf));

    ..do something with buf..

I think it is quite unpleasant to have the skipFunc be called
out of band to the code dealing with the Recv.

I think it is preferrable that we have an explicit synchronous
API for consuming the hole.

The same really applies from POV of the upload side - I think
we should be able to synchronously push the hole in, rather
than having the asynchronous InData callback.

IIUC, the reason you've chosen this async callback approach
is because the virStream{Send,Recv,SendAll,RecvAll} methods
are not extensible to cope with holes.

I'm thinking though that we'd get a more attractive public API
by creating  hole friendly versions of virStream{Send,Recv,etc}
instead of going the extra callback route.

The download side could we made to work if we had a flag for
virStreamRecv that instructed it to only read data up to the
next hole, and defined an explicit error code to use to indicate
that we've hit a hole.

This would be used thus:

 while (1) {
    char buf[4096];

    int ret = virStreamRecvFlags(st, buf, len, VIR_STREAM_STOP_AT_HOLE);
    if (ret == -1) {
        virErrorPtr err = virGetLastError();
	if (err.code == VIR_ERR_STREAM_HOLE) {
	   ret = virStreamHoleSize(st, buf);
	   ...seek ret bytes in target...
	} else {
	   return -1;
    } else {
        ...write buf to target...

To make this work with virStreamRecvAll, we would need to add
a virStreamSparseRecvAll() which had two callbacks eg

  typedef int (*virStreamSinkHoleFunc)(virStreamPtr st, unsigned long long len):

  virStreamSparseRecvAll(virStreamPtr stream,
                         virStreamSinkFunc handler,
                         virStreamSinkHoleFunc holeHandler,
                         void *opaque) {
    while (1) {
      char buf[4096];

      int ret = virStreamRecvFlags(st, buf, len, VIR_STREAM_STOP_AT_HOLE);
      if (ret == -1) {
        virErrorPtr err = virGetLastError();
	if (err.code == VIR_ERR_STREAM_HOLE) {
	   ret = virStreamHoleSize(st, buf);
	   holeHandler(st, ret);
	} else {
	   return -1;
      } else {
         handler(st, buf, ret);

Now considering the upload side of things, the virStreamSend
function doesn't actually need cahnges, though it could do
with a flags parameter for best practice. We just need the
virStreamSkip() function you already add.

 while (1) {
    char buf[4096];
    if (..in hole...) {
       ..get hole size...
       virStreamSkip(st, len);
    } else {
       ...read N bytes...
       virStreamSend(st, buf, len);

The SendAll method is again more complicated as the callback we pass
into it is insufficient to figure out if we have holes. We could
add a virStreamSparseSendAll() which had two callbacks again:

 typedef int (*virStreamSourceHoleFunc)(holeHandler);

This returns the length of the current hole, or 0 if not at
a hole.

 virStreamSparseSendAll(virStreamPtr stream,
                        virStreamSourceFunc handler,
                        virStreamSourceHoleFunc holeHandler,
                        void *opaque)
   while (1) {
      char buf[4096];
      int ret = holeHandler(st);
      if (ret > 0) {
         virStreamSkip(st, ret);
      } else {
         ret = handler(st, buf, sizeof(buf);
         virStreamSend(st, buf, ret);

So we would avoid the  virStreamInData and virStreamRegisterInData
and virStreamRegisterSkip and virStreamSkipCallback methods

|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

More information about the libvir-list mailing list