[Libguestfs] Readahead in the nbdkit curl plugin

Richard W.M. Jones rjones at redhat.com
Mon Apr 1 16:55:20 UTC 2019


On Mon, Apr 01, 2019 at 02:09:06PM +0100, Richard W.M. Jones wrote:
> We already have a cache filter so I have two ideas:
> 
> (1) Modify nbdkit-cache-filter to add a readahead parameter.

There's a big problem that I didn't appreciate until now: the cache
filter ends up splitting large reads rather badly.  For example, if
the client is issuing 2M reads (not unreasonable for ‘qemu-img
convert’) then the cache filter divides these into 4K requests (its
internal block size) to the plugin.

Compare:

$ iso='https://download.fedoraproject.org/pub/fedora/linux/releases/29/Workstation/x86_64/iso/Fedora-Workstation-Live-x86_64-29-1.2.iso'

$ ./nbdkit -U - -fv \
    curl "$iso" \
    --run 'qemu-img convert -f raw -p $nbd /var/tmp/out'
nbdkit: curl[1]: debug: pread count=2097152 offset=0
nbdkit: curl[1]: debug: pread count=2097152 offset=2097152
nbdkit: curl[1]: debug: pread count=2097152 offset=4194304
nbdkit: curl[1]: debug: pread count=2097152 offset=6291456

$ ./nbdkit -U - -fv --filter=cache \
    curl "$iso" \
    --run 'qemu-img convert -f raw -p $nbd /var/tmp/out'
nbdkit: curl[1]: debug: cache: pread count=2097152 offset=0 flags=0x0
nbdkit: curl[1]: debug: cache: blk_read block 0 (offset 0) is not cached
nbdkit: curl[1]: debug: pread count=4096 offset=0
nbdkit: curl[1]: debug: cache: blk_read block 1 (offset 4096) is not cached
nbdkit: curl[1]: debug: pread count=4096 offset=4096
nbdkit: curl[1]: debug: cache: blk_read block 2 (offset 8192) is not cached
nbdkit: curl[1]: debug: pread count=4096 offset=8192
nbdkit: curl[1]: debug: cache: blk_read block 3 (offset 12288) is not cached
nbdkit: curl[1]: debug: pread count=4096 offset=12288
nbdkit: curl[1]: debug: cache: blk_read block 4 (offset 16384) is not cached
nbdkit: curl[1]: debug: pread count=4096 offset=16384

(FWIW we want reads of 64M or larger to get decent performance with
virt-v2v.)  Unfortunately the cache filter kills performance dead,
because every one of those 4K requests pays a full round trip to the
web server.

This is a problem with the cache filter that we could likely solve
with a bit of effort, but let's go back and take a look at option
number 2 again:

> (2) Add a new readahead filter which extends all pread requests

When I'm doing v2v / qemu-img convert I don't really need the cache
filter, except that it was a convenient place to store the prefetched
data.

A dumber readahead filter might help here.  Suppose it simply stores
the position where the last read finished, and prefetches (and saves)
a certain amount of data following that read.  If the next read is
sequential, ie. it starts at the saved position pointer, return the
saved data; otherwise throw it away and do a normal read.
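To make this concrete, here is a quick sketch of that logic in C.
Note this is not written against the real nbdkit filter API:
plugin_pread() is a hypothetical stand-in for the call down into the
plugin, READAHEAD_WINDOW is an arbitrary example size, and a real
filter would also need locking and would have to clamp the prefetch
to the size of the export.

/* Sketch of a "dumb" sequential readahead filter.  plugin_pread()
 * is a hypothetical stand-in for the underlying plugin's pread;
 * READAHEAD_WINDOW is an arbitrary example size.  Not thread-safe,
 * and no clamping at end of export.
 */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define READAHEAD_WINDOW (8 * 1024 * 1024)

extern int plugin_pread (void *buf, uint32_t count, uint64_t offset);

static uint64_t position;       /* offset just past the last read */
static char *window;            /* saved readahead data, or NULL */
static uint32_t window_size;    /* number of valid bytes in window */

int
readahead_pread (void *buf, uint32_t count, uint64_t offset)
{
  /* Sequential read satisfied entirely by the saved window. */
  if (window != NULL && offset == position && count <= window_size) {
    memcpy (buf, window, count);
    /* Slide the window forward past the bytes just consumed. */
    memmove (window, window + count, window_size - count);
    window_size -= count;
    position += count;
    return 0;
  }

  /* Non-sequential read (or window exhausted): throw the saved
   * data away and do a normal read.
   */
  free (window);
  window = NULL;
  window_size = 0;

  if (plugin_pread (buf, count, offset) == -1)
    return -1;
  position = offset + count;

  /* Prefetch and save the data following this read. */
  window = malloc (READAHEAD_WINDOW);
  if (window != NULL &&
      plugin_pread (window, READAHEAD_WINDOW, position) == 0)
    window_size = READAHEAD_WINDOW;
  else {
    free (window);
    window = NULL;
  }

  return 0;
}

An obvious refinement would be to refill the window asynchronously
before it runs dry, but even this synchronous version replaces
hundreds of tiny round trips with one large request per window.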

I believe that this would solve the readahead problem in this case
(but I haven't tested it yet).

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/
