[Libguestfs] Some questions about nbdkit vs qemu performance affecting virt-v2v

Richard W.M. Jones rjones at redhat.com
Tue Jul 27 12:28:17 UTC 2021


On Tue, Jul 27, 2021 at 02:18:28PM +0200, Martin Kletzander wrote:
> On Tue, Jul 27, 2021 at 12:16:59PM +0100, Richard W.M. Jones wrote:
> >Hi Eric, a couple of questions below about nbdkit performance.
> >
> >Modular virt-v2v will use disk pipelines everywhere.  The input
> >pipeline looks something like this:
> >
> > socket <- cow filter <- cache filter <-   nbdkit
> >                                          curl|vddk
> >
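> >(In nbdkit terms the first --filter listed is the one nearest the
> >client, so the stack above would be assembled along these lines; this
> >is only illustrative, not the exact modular virt-v2v invocation, and
> >the URL is just a placeholder:
> >
> >$ nbdkit --filter=cow --filter=cache curl https://example.com/disk.img
> >
> >with cache-on-read enabled on the cache filter as discussed below.)
> >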
> >We found there's a notable slowdown in at least one case: when the
> >source plugin is very slow (eg. the curl plugin fetching from a slow,
> >remote website, or VDDK in general), everything runs very slowly.
> >
> >I made a simple test case to demonstrate this:
> >
> >$ virt-builder fedora-33
> >$ time ./nbdkit --filter=cache --filter=delay file /var/tmp/fedora-33.img \
> >    delay-read=500ms \
> >    --run 'virt-inspector --format=raw -a "$uri" -vx'
> >
> >This uses a local file with the delay filter on top, injecting
> >half-second delays into every read.  It "feels" a lot like the slow
> >case we were observing.  Virt-v2v also does inspection as a first
> >step when converting an image, so using virt-inspector is somewhat
> >realistic.
> >
> >Unfortunately this actually runs far too slowly for me to wait around
> >- at least 30 mins, and probably a lot longer.  This compares to only
> >7 seconds if you remove the delay filter.
> >
> >Reducing the delay to 50ms means at least it finishes in a reasonable time:
> >
> >$ time ./nbdkit --filter=cache --filter=delay file /var/tmp/fedora-33.img \
> >    delay-read=50ms \
> >    --run 'virt-inspector --format=raw -a "$uri"'
> >
> >real    5m16.298s
> >user    0m0.509s
> >sys     0m2.894s
> >
> >In the above scenario the cache filter is not actually doing anything
> >(since virt-inspector does not write).  Adding cache-on-read=true lets
> >us cache the reads, avoiding going through the "slow" plugin in many
> >cases, and the result is a lot better:
> >
> >$ time ./nbdkit --filter=cache --filter=delay file /var/tmp/fedora-33.img \
> >    delay-read=50ms cache-on-read=true \
> >    --run 'virt-inspector --format=raw -a "$uri"'
> >
> >real    0m27.731s
> >user    0m0.304s
> >sys     0m1.771s
> >
> >However this is still slower than the old method which used qcow2 +
> >qemu's copy-on-read.  It's harder to demonstrate this, but I modified
> >virt-inspector to use the copy-on-read setting (which it doesn't do
> >normally).  On top of nbdkit with 50ms delay and no other filters:
> >
> >qemu + copy-on-read backed by nbdkit delay-read=50ms file:
> >real    0m23.251s
> >
> >So 23s is the time to beat.  (I believe that with longer delays, the
> >gap between qemu and nbdkit increases in favour of qemu.)
> >
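> >(For reference, the shape of that qemu setup is roughly the
> >following; this is only a sketch, not the exact command line that
> >libguestfs constructs, and the socket path is a placeholder:
> >
> >$ qemu-img create -f qcow2 -F raw \
> >    -b 'nbd+unix:///?socket=/tmp/nbd.sock' overlay.qcow2
> >$ qemu ... -drive file=overlay.qcow2,format=qcow2,copy-on-read=on,...
> >
> >ie. a temporary qcow2 overlay on top of the NBD source, with
> >copy-on-read populating the overlay as clusters are read.)
> >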
> >Q1: What other ideas could we explore to improve performance?
> >
> 
> First thing that came to mind: could it be that QEMU's copy-on-read
> maybe caches bigger blocks, making it effectively do some small
> read-ahead as well?

Not since the changes yesterday[1].  Before those commits went in,
both the cow and cache filters were actually splitting up all requests
to the plugin into 4K blocks.  This was very pessimal.  Afterwards we
coalesce adjacent blocks which have the same state, which in practice
means that requests are not split.  In addition, we now use a 64K
block size, so smaller requests are rounded up to 64K requests to the
plugin.

In the qemu case, qcow2's 64K default cluster size means that 64K
requests are always being made to the plugin.

[1] https://listman.redhat.com/archives/libguestfs/2021-July/msg00044.html
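
(One way to verify what actually reaches the plugin is to slip the log
filter in below the cache filter; this is just for checking the
request sizes, not something virt-v2v would do:

$ nbdkit --filter=cache --filter=log file /var/tmp/fedora-33.img \
    logfile=/tmp/nbd.log cache-on-read=true \
    --run 'virt-inspector --format=raw -a "$uri"'
$ grep Read /tmp/nbd.log | head

The count= fields in the log show the sizes of the read requests
hitting the file plugin.)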

> >- - -
> >
> >In real scenarios we'll actually want to combine cow + cache, where
> >cow is caching writes, and cache is caching reads.
> >
> > socket <- cow filter <- cache filter   <-  nbdkit
> >                      cache-on-read=true   curl|vddk
> >
> >The cow filter is necessary to prevent changes being written back to
> >the pristine source image.
> >
> >This is actually surprisingly efficient, making no noticeable
> >difference in this test:
> >
> >time ./nbdkit --filter=cow --filter=cache --filter=delay \
> >    file /var/tmp/fedora-33.img \
> >    delay-read=50ms cache-on-read=true \
> >    --run 'virt-inspector --format=raw -a "$uri"'
> >
> >real	0m27.193s
> >user	0m0.283s
> >sys	0m1.776s
> >
> >Q2: Should we consider a "cow-on-read" flag to the cow filter (thus
> >removing the need to use the cache filter at all)?
> >
> 
> That would make at least some sense since there is cow-on-cache already
> (albeit a little confusing for me personally).

I forgot about that one.  cow-on-read would work similarly.
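
The idea would be that something like this:

$ nbdkit --filter=cow --filter=delay file /var/tmp/fedora-33.img \
    delay-read=50ms cow-on-read=true \
    --run 'virt-inspector --format=raw -a "$uri"'

would behave much like the cow + cache + cache-on-read stack above,
storing blocks read from the plugin in the cow overlay.  (cow-on-read
is the flag being proposed here; it doesn't exist yet.)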

> I presume it would not increase the size of the difference (when
> using qemu-img rebase) at all, right?  I do not see, however, how it
> would be faster than the existing:
>
>   cow <- cache[cache-on-read]

It should be simpler.  I'm also worried that I'm missing something and
that maybe cow + cache + cache-on-read is actually double-caching in
some case that I didn't think of.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW



