[libvirt] [PATCH 00/27] Introduce sparse streams

Michal Privoznik mprivozn at redhat.com
Thu Apr 28 10:04:47 UTC 2016


So, after couple of sleepless nights and headaches I'm proud to
announce that finally got this working.


What?
Our regular streams that are can be used to transfer disk images
for domains are unaware of any sparseness. Therefore they have
two limitations:

a) transferring big but sparse image can take ages as all the
holes (interpreted by kernel as '\0') have to go through our
event loop.
b) resulting volume is not sparse even if the source was.


How?
I went by verified approach that linux kernel has. One way to
look at our streams is just like read() and write() with a
different names: virStreamRecv() and virStreamSend(). They even
have the same prototype (if 'int fd' is substituted with
'virStreamPtr'). Now, holes in files are created and detected via
third API: lseek(). Therefore I'm introducing new virStreamSkip()
API that mimics the missing primitive. Now, because our streams
do not necessarily have to work over files (they are for generic
data transfer), I had to let users register a callback that is
called whenever the other side calls virStreamSkip().

So now that we have all three primitives, we can focus on making
life easier for our users. Nobody is actually using bare
virStreamSend() and virStreamRecv() rather than our wrappers:
virStreamSendAll() and virStreamRecvAll(). With my approach
described above just virStreamSendAll() needs to be adjusted so
that it's 'sparse file' aware. The virStreamRecvAll() will only
get the data to write (just like it is now) with skip callback
called automatically whenever needed. In order for
virStreamSendAll() to skip holes I'm introducing yet another
callback: virStreamInDataFunc(). This callback will help us to
create a map of a file: before each virStreamSend() it checks
whether we are in a data section or a hole and calls
virStreamSend() or virStreamSkip() respectively.

Do not worry - it will all become clear once you see the code.

Now question is - how will users enable this feature? I mean, we
have take into account that we might be talking to an older
daemon that does not know how to skip a hole. Or vice versa -
older client.
The solution I came up with is to introduce flags to APIs where
sparse streams make sense. I guess it makes sense for volume
upload and download, but almost certainly makes no sense for
virDomainOpenConsole().


Code?
>From users POV they just need to pass correct argument to
'vol-upload' or 'vol-download' virsh commands. One layer down, on
programming level they need to merely:

st = virStreamNew(conn, 0);
virStreamRegisterSkip(st, skipFunc, &fd);
virStorageVolDownload(st, ...);
virStreamRecvAll(st, sinkFunc, &fd);

where:

int skipFunc(virStreamPtr st,
             unsigned long long offset, void *opaque)
{
    int *fd = opaque;
    return lseek(*fd, offset, SEEK_CUR) == (off_t) -1 ? -1 : 0;
}

And for uploading it's slightly more verbose - see patch 24.


Limitations?
While testing this on my machine with XFS, I've noticed that the
resulting map of a transferred file is not exactly the same as
the source's. Checksums are the same though. After digging deeper
I found this commit in the kernel:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=055388a3188f56676c21e92962fc366ac8b5cb72

Thing is, as we transfer the file, we are practically just
seeking at EOF and thus creating holes. But if the hole size is
small enough, XFS will use some speculative file allocation
algorithm and eventually fully allocate the blocks even if we
intended to create a hole. This does not occur when punching a
hole into a file though. Well, I guess XFS devels have some
reasons to do that.

This behaviour has not been observed on EXT4.


Notes?
Oh, patches 01-03 have been ACKed already, but are not pushed yet
because of the freeze. But since this feature build on the top of
them, I'm sending them too.

Also the whole patch set is accessible at my github:

https://github.com/zippy2/libvirt/tree/sparse_streams4


Michal Privoznik (27):
  Revert "rpc: Fix slow volume download (virsh vol-download)"
  virnetclientstream: Process stream messages later
  virStream{Recv,Send}All: Increase client buffer
  Introduce virStreamSkip
  Introduce virStreamRegisterSkip and virStreamSkipCallback
  Introduce virStreamInData and virStreamRegisterInData
  virNetClientStreamNew: Track origin stream
  Track if stream is seekable
  RPC: Introduce virNetStreamSkip
  Introduce VIR_NET_STREAM_SKIP message type
  Teach wireshark plugin about VIR_NET_STREAM_SKIP
  daemon: Implement VIR_NET_STREAM_SKIP handling
  daemon: Introduce virNetServerProgramSendStreamSkip
  virnetclientstream: Introduce virNetClientStreamSendSkip
  virnetclientstream: Introduce virNetClientStreamHandleSkip
  remote_driver: Implement virStreamSkip
  daemonStreamHandleRead: Wire up seekable stream
  virNetClientStream: Wire up VIR_NET_STREAM_SKIP
  virStreamSendAll: Wire up sparse streams
  fdstream: Implement seek
  gendispatch: Introduce @sparseflag for our calls
  Introduce virStorageVol{Download,Upload}Flags
  virsh: Implement sparse stream to vol-download
  virsh: Implement sparse stream to vol-upload
  fdstream: Suppress use of IO helper for sparse streams
  daemon: Don't call virStreamInData so often
  storage: Enable sparse streams for virStorageVol{Download,Upload}

 daemon/remote.c                      |   2 +-
 daemon/stream.c                      | 134 +++++++++++++--
 daemon/stream.h                      |   3 +-
 include/libvirt/libvirt-storage.h    |   9 +
 include/libvirt/libvirt-stream.h     |  47 ++++++
 src/datatypes.h                      |   8 +
 src/driver-stream.h                  |   5 +
 src/fdstream.c                       | 156 +++++++++++++++---
 src/fdstream.h                       |   3 +-
 src/libvirt-storage.c                |   4 +-
 src/libvirt-stream.c                 | 238 ++++++++++++++++++++++++++-
 src/libvirt_internal.h               |   7 +
 src/libvirt_private.syms             |   2 +
 src/libvirt_public.syms              |   7 +
 src/libvirt_remote.syms              |   2 +
 src/remote/remote_driver.c           |  41 ++++-
 src/remote/remote_protocol.x         |   2 +
 src/rpc/gendispatch.pl               |  21 ++-
 src/rpc/virnetclient.c               |   1 +
 src/rpc/virnetclientstream.c         | 308 ++++++++++++++++++++++++-----------
 src/rpc/virnetclientstream.h         |  10 +-
 src/rpc/virnetprotocol.x             |  16 +-
 src/rpc/virnetserverprogram.c        |  33 ++++
 src/rpc/virnetserverprogram.h        |   7 +
 src/storage/storage_backend.c        |  12 +-
 src/storage/storage_driver.c         |   4 +-
 src/virnetprotocol-structs           |   4 +
 tools/virsh-volume.c                 |  40 ++++-
 tools/virsh.c                        |  79 +++++++++
 tools/virsh.h                        |  12 ++
 tools/virsh.pod                      |   6 +-
 tools/wireshark/src/packet-libvirt.c |  40 +++++
 tools/wireshark/src/packet-libvirt.h |   2 +
 33 files changed, 1104 insertions(+), 161 deletions(-)

-- 
2.8.1




More information about the libvir-list mailing list