[Libguestfs] Some questions about nbdkit vs qemu performance affecting virt-v2v

Martin Kletzander mkletzan at redhat.com
Tue Jul 27 12:18:28 UTC 2021


On Tue, Jul 27, 2021 at 12:16:59PM +0100, Richard W.M. Jones wrote:
>Hi Eric, a couple of questions below about nbdkit performance.
>
>Modular virt-v2v will use disk pipelines everywhere.  The input
>pipeline looks something like this:
>
>  socket <- cow filter <- cache filter <-   nbdkit
>                                           curl|vddk
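>
>As a concrete command line, such a pipeline might be assembled roughly
>like this (a sketch using the curl plugin; the URL is only
>illustrative, and vddk would be analogous):
>
>  $ nbdkit --filter=cow --filter=cache curl https://example.com/disk.img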
>
>We found there's a notable slowdown in at least one case: when the
>source plugin is very slow (e.g. the curl plugin fetching from a slow,
>remote website, or VDDK in general), everything runs very slowly.
>
>I made a simple test case to demonstrate this:
>
>$ virt-builder fedora-33
>$ time ./nbdkit --filter=cache --filter=delay file /var/tmp/fedora-33.img \
>     delay-read=500ms \
>     --run 'virt-inspector --format=raw -a "$uri" -vx'
>
>This uses a local file with the delay filter on top, injecting
>half-second delays into every read.  It "feels" a lot like the slow
>case we were observing.  Virt-v2v also does inspection as a first step
>when converting an image, so using virt-inspector is somewhat
>realistic.
>
>Unfortunately this runs far too slowly for me to wait around: at
>least 30 minutes, and probably a lot longer.  This compares to only
>7 seconds if you remove the delay filter.
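>
>For reference, the 7 second figure is roughly the same command with the
>delay filter dropped (a sketch):
>
>$ time ./nbdkit --filter=cache file /var/tmp/fedora-33.img \
>     --run 'virt-inspector --format=raw -a "$uri" -vx'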
>
>Reducing the delay to 50ms means at least it finishes in a reasonable time:
>
>$ time ./nbdkit --filter=cache --filter=delay file /var/tmp/fedora-33.img \
>     delay-read=50ms \
>     --run 'virt-inspector --format=raw -a "$uri"'
>
>real    5m16.298s
>user    0m0.509s
>sys     0m2.894s
>
>In the above scenario the cache filter is not actually doing anything
>(since virt-inspector does not write).  Adding cache-on-read=true lets
>us cache the reads, avoiding going through the "slow" plugin in many
>cases, and the result is a lot better:
>
>$ time ./nbdkit --filter=cache --filter=delay file /var/tmp/fedora-33.img \
>     delay-read=50ms cache-on-read=true \
>     --run 'virt-inspector --format=raw -a "$uri"'
>
>real    0m27.731s
>user    0m0.304s
>sys     0m1.771s
>
>However this is still slower than the old method, which used qcow2 +
>qemu's copy-on-read.  It's harder to demonstrate this, but I modified
>virt-inspector to use the copy-on-read setting (which it doesn't do
>normally).  On top of nbdkit with a 50ms delay and no other filters:
>
>qemu + copy-on-read backed by nbdkit delay-read=50ms file:
>real    0m23.251s
>
>So 23s is the time to beat.  (I believe that with longer delays, the
>gap between qemu and nbdkit increases in favour of qemu.)
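>
>For reference, the qcow2 + copy-on-read approach corresponds roughly to
>the following (a sketch only; the overlay name is illustrative, and
>this is not literally what libguestfs runs internally):
>
>  $ qemu-img create -f qcow2 -b "$uri" -F raw overlay.qcow2
>  $ qemu-system-x86_64 ... -drive file=overlay.qcow2,copy-on-read=on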
>
>Q1: What other ideas could we explore to improve performance?
>

The first thing that came to mind: could it be that QEMU's copy-on-read
caches bigger blocks, making it effectively do some small read-ahead as
well?
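
If that is the case, one way to test it might be to stack nbdkit's
readahead filter under the cache (a sketch; whether this placement
actually helps here is just a guess on my part):

   $ time ./nbdkit --filter=cache --filter=readahead --filter=delay \
        file /var/tmp/fedora-33.img \
        delay-read=50ms cache-on-read=true \
        --run 'virt-inspector --format=raw -a "$uri"'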

>- - -
>
>In real scenarios we'll actually want to combine cow + cache, where
>cow is caching writes, and cache is caching reads.
>
>  socket <- cow filter <- cache filter   <-  nbdkit
>                       cache-on-read=true   curl|vddk
>
>The cow filter is necessary to prevent changes being written back to
>the pristine source image.
>
>This is actually surprisingly efficient, making no noticeable
>difference in this test:
>
>time ./nbdkit --filter=cow --filter=cache --filter=delay \
>     file /var/tmp/fedora-33.img \
>     delay-read=50ms cache-on-read=true \
>     --run 'virt-inspector --format=raw -a "$uri"'
>
>real	0m27.193s
>user	0m0.283s
>sys	0m1.776s
>
>Q2: Should we consider a "cow-on-read" flag to the cow filter (thus
>removing the need to use the cache filter at all)?
>

That would make at least some sense, since there is already cow-on-cache
(although I personally find that one a little confusing).  I presume it
would not increase the size of the resulting diff (when using qemu-img
rebase) at all, right?  However, I do not see how it would be faster
than the existing stack:

   cow <- cache[cache-on-read]
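
For comparison, the proposed flag would presumably collapse the stack to
something like this (purely hypothetical: cow-on-read does not exist
yet, this is only what I imagine the invocation would look like):

   # cow-on-read=true is the proposed, not-yet-existing flag
   $ time ./nbdkit --filter=cow --filter=delay file /var/tmp/fedora-33.img \
        delay-read=50ms cow-on-read=true \
        --run 'virt-inspector --format=raw -a "$uri"'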

Martin

>
>Rich.
>
>-- 
>Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
>Read my programming and virtualization blog: http://rwmj.wordpress.com
>virt-df lists disk usage of guests without needing to install any
>software inside the virtual machine.  Supports Linux and Windows.
>http://people.redhat.com/~rjones/virt-df/
>
>_______________________________________________
>Libguestfs mailing list
>Libguestfs at redhat.com
>https://listman.redhat.com/mailman/listinfo/libguestfs
>