[libvirt] Sparse image volDownload

Daniel P. Berrange berrange at redhat.com
Tue Dec 8 09:31:50 UTC 2015


On Tue, Dec 08, 2015 at 10:13:31AM +0100, Michal Privoznik wrote:
> On 07.12.2015 20:25, Vasiliy Tolstov wrote:
> > On 7 Dec 2015 at 18:13, "Daniel P. Berrange" <berrange at redhat.com> wrote:
> >>
> >> On Mon, Dec 07, 2015 at 04:04:40PM +0100, Michal Privoznik wrote:
> >>> On 07.12.2015 14:51, Daniel P. Berrange wrote:
> >>>> On Mon, Dec 07, 2015 at 02:46:59PM +0100, Michal Privoznik wrote:
> >>>>> Dear list,
> >>>>>
> >>>>> I'd like to hear your opinion on the following bug:
> >>>>>
> >>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1282859
> >>>>>
> >>>>> Long story short, imagine the following scenario:
> >>>>>
> >>>>> 1. Create 4GB file full of zeroes
> >>>>> 2. virsh vol-download it
> >>>>>
> >>>>> What happens is that all those 4GB are transferred byte after byte
> >>>>> through our RPC system. Not only does this put needless pressure on our
> >>>>> event loop, it's suboptimal for network and other resources too.
> >>>>>
> >>>>> I'd like to explore our options here keeping in mind that the original
> >>>>> volume might have been sparse and we ought to keep it sparse on the
> >>>>> destination too.
> >>>>>
> >>>>> In the bug the reporter (Matthew Booth) suggests introducing a new type
> >>>>> of RPC message that will let us keep our APIs unchanged. The source will
> >>>>> scan the file for windows of zeroes bigger than some value. When one is
> >>>>> found, the new type of message is passed to the client without the need
> >>>>> to copy those zeroes. Yes, this is very similar to RLE.
> >>>>>
> >>>>> If we are going that way, should we enable users to put a compression
> >>>>> program in between read()/write() and our RPC? Well, should we let users
> >>>>> choose what compression program we will put there? Because there are
> >>>>> better compression algorithms than RLE.
> >>>>
> >>>> It only looks like compression if you're solely looking at the network
> >>>> data transfer. A key feature of sparse support is that we preserve
> >>>> the sparseness on both sides.
> >>>>
> >>>> i.e., if I have a sparse raw file locally, and vol-upload it, it should
> >>>> remain a sparse file on the server. Likewise vol-downloading a sparse
> >>>> file should let me create a sparse file locally.  For this reason the
> >>>> RPC program must explicitly represent data holes, and not merely
> >>>> consider them a type of compression algorithm, as that would not let
> >>>> us preserve the holes on both ends of the stream.
> >>>
> >>> Right. But how could we apply both our RLE algorithm and an external
> >>> program on the same stream? Should we multiplex and send holes to the
> >>> other side as they are and run the rest through the external compression
> >>> program? Otherwise I don't see how we could preserve sparseness.
> >>
> >> I think we should just focus on sending holes in the RPC protocol
> >> right now, and not try to do compression at the same time, as we need
> >> to be able to represent holes in the protocol regardless of whether
> >> compression is present.
> >>
> > 
> > Some time ago I already asked about this, and about adding a compress flag
> > to vol upload and download (I haven't had time to complete it).
> > For my use case the best way is to be able to create a compressed stream
> > that goes to libvirt. In that case we effectively solve the sparse file
> > problem and also transfer less data. All my tests with lz4 compression show
> > a benefit of at least about 20% compared to the original volume size.
> 
> Right. And as Dan pointed out, these two approaches are orthogonal to
> each other. Compressing a stream of data to reduce size is a nice
> feature to have, preserving sparseness of a file is something different
> though (although the way I'm intending to implement it will reduce data
> sent through virStream too).
> 
> One thing that I am still wondering about is sparseness detection.
> Finding a window full of zeroes in a file does not necessarily mean that
> those come from read() over a segment that's not on disk. We surely can
> have a raw file that is sparse and also contains a window full of
> zeroes. But I guess it's okay if we sparsify (if that's even a verb)
> the file even more on volDownload or volUpload.

There is an ioctl you can use to detect actual holes on recent-ish
Linux. You only need to fall back to scanning for zeros if that is
not available.
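
To make the "detect real holes, else scan for zeros" idea concrete, here
is a rough sketch in Python. It is not libvirt code. Note that it uses
lseek() with SEEK_DATA/SEEK_HOLE rather than an ioctl (FIEMAP is the
extent-mapping ioctl on Linux), simply because lseek() is easy to show
in a few lines. The names file_segments and BLOCK are made up for
illustration, and the 64 KiB scan granularity is an arbitrary assumption.

```python
import errno
import os

BLOCK = 64 * 1024  # hypothetical granularity for the zero-scan fallback


def segments_by_lseek(fd, size):
    """Yield (kind, offset, length) using SEEK_DATA/SEEK_HOLE."""
    pos = 0
    while pos < size:
        try:
            data = os.lseek(fd, pos, os.SEEK_DATA)
        except OSError as e:
            if e.errno != errno.ENXIO:
                raise
            data = size  # ENXIO: only a hole remains up to end of file
        if data > pos:
            yield ("hole", pos, data - pos)
        if data >= size:
            break
        # There is always an implicit hole at end of file, so this succeeds.
        end = os.lseek(fd, data, os.SEEK_HOLE)
        yield ("data", data, end - data)
        pos = end


def segments_by_scan(fd, size):
    """Fallback: treat all-zero blocks as holes (may over-sparsify)."""
    zero = bytes(BLOCK)
    pos, run_kind, run_start = 0, None, 0
    while pos < size:
        chunk = os.pread(fd, BLOCK, pos)
        if not chunk:
            break
        kind = "hole" if chunk == zero[:len(chunk)] else "data"
        if kind != run_kind:
            if run_kind is not None:
                yield (run_kind, run_start, pos - run_start)
            run_kind, run_start = kind, pos
        pos += len(chunk)
    if run_kind is not None:
        yield (run_kind, run_start, pos - run_start)


def file_segments(path):
    """Return a list of ("data" | "hole", offset, length) covering the file."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        if hasattr(os, "SEEK_DATA"):
            try:
                return list(segments_by_lseek(fd, size))
            except OSError:
                pass  # lseek path failed; fall through to the zero scan
        return list(segments_by_scan(fd, size))
    finally:
        os.close(fd)
```

The resulting list is the kind of explicit data/hole description the
sender could walk: read and send the "data" ranges, and send only
(offset, length) for the "hole" ranges so the receiver can skip over
them. On a filesystem that does not track holes the lseek() path
typically reports the whole file as one data segment, which is safe
but saves nothing.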


Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|



