[Libguestfs] FYI: perf commands I'm using to benchmark nbdcopy

Richard W.M. Jones rjones at redhat.com
Wed May 26 10:25:43 UTC 2021


On Wed, May 26, 2021 at 10:32:08AM +0100, Richard W.M. Jones wrote:
> On Wed, May 26, 2021 at 11:40:11AM +0300, Nir Soffer wrote:
> > On Tue, May 25, 2021 at 9:06 PM Richard W.M. Jones <rjones at redhat.com> wrote:
> > > I ran perf as below.  Although nbdcopy and nbdkit themselves do not
> > > require root (and usually should _not_ be run as root), in this case
> > > perf must be run as root, so everything has to be run as root.
> > >
> > >   # perf record -a -g --call-graph=dwarf ./nbdkit -U - sparse-random size=1T --run "MALLOC_CHECK_= ../libnbd/run nbdcopy \$uri \$uri"
> > 
> > This uses 64 requests with a request size of 32M. In my tests, using
> > --requests 16 --request-size 1048576 is faster. Did you try profiling
> > this?
> 
> Interesting!  No, I didn't.  In fact I just assumed that larger request
> sizes and more parallel requests would be better.

This is the topology of the machine I ran the tests on:

  https://rwmj.files.wordpress.com/2019/09/screenshot_2019-09-04_11-08-41.png

Even a single 32MB buffer isn't going to fit in any cache, so reducing
the buffer size should be a win, and once the buffers fit within the
L3 cache, reusing them should also be a win.
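
To put rough numbers on that, here is a minimal sketch of the buffer
memory that could be in flight for the settings mentioned in this
thread.  It assumes each outstanding request holds exactly one buffer
of the request size and ignores any per-connection multiplier; the
helper name in_flight_bytes is made up for illustration and is not
part of nbdcopy.

```python
MiB = 1024 * 1024


def in_flight_bytes(requests: int, request_size: int) -> int:
    """Upper bound on buffer memory, assuming one buffer per outstanding request."""
    return requests * request_size


# (number of requests, request size in bytes) for each setting discussed above.
configs = {
    "64 requests x 32M (as described in this thread)": (64, 32 * MiB),
    "64 requests x 1M (--request-size=1048576)": (64, 1048576),
    "16 requests x 1M (--requests 16 --request-size 1048576)": (16, 1048576),
}

for name, (requests, size) in configs.items():
    total = in_flight_bytes(requests, size)
    print(f"{name}: {total // MiB} MiB in flight")
```

Compare the totals against the cache sizes shown in the topology
picture to see which settings could plausibly stay cache-resident.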

That's the theory anyway ...  Using --request-size=1048576 changes the
flamegraph quite dramatically (see new attachment).

[What do the swapper stack traces mean?  Are they coming from idle
cores?]

The test runs slightly faster with the smaller request size (about
7.6% lower mean wall-clock time):

  $ hyperfine 'nbdkit -U - sparse-random size=1T --run "nbdcopy \$uri \$uri"'
  Benchmark #1: nbdkit -U - sparse-random size=1T --run "nbdcopy \$uri \$uri"
    Time (mean ± σ):     47.407 s ±  0.953 s    [User: 347.982 s, System: 276.220 s]
    Range (min … max):   46.474 s … 49.373 s    10 runs
 
  $ hyperfine 'nbdkit -U - sparse-random size=1T --run "nbdcopy --request-size=1048576 \$uri \$uri"'
  Benchmark #1: nbdkit -U - sparse-random size=1T --run "nbdcopy --request-size=1048576 \$uri \$uri"
    Time (mean ± σ):     43.796 s ±  0.799 s    [User: 328.134 s, System: 252.775 s]
    Range (min … max):   42.289 s … 44.917 s    10 runs
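
As a sanity check on the two hyperfine results above, this small sketch
computes the relative change in mean wall-clock time and in total CPU
time (user + system).  The numbers are copied from the output above;
the dictionary layout and the helper percent_reduction are only for
illustration.

```python
# Figures copied from the two hyperfine runs above (all in seconds).
default_run = {"mean": 47.407, "min": 46.474, "max": 49.373,
               "user": 347.982, "system": 276.220}
small_run = {"mean": 43.796, "min": 42.289, "max": 44.917,
             "user": 328.134, "system": 252.775}


def percent_reduction(before: float, after: float) -> float:
    """Percentage by which 'after' is smaller than 'before'."""
    return 100.0 * (before - after) / before


wall = percent_reduction(default_run["mean"], small_run["mean"])
cpu = percent_reduction(default_run["user"] + default_run["system"],
                        small_run["user"] + small_run["system"])

# True if the slowest small-request run still beat the fastest default run.
ranges_disjoint = small_run["max"] < default_run["min"]

print(f"wall-clock reduction: {wall:.1f}%")
print(f"CPU time reduction:   {cpu:.1f}%")
print(f"min/max ranges do not overlap: {ranges_disjoint}")
```

Both figures come from only 10 runs each, so treat them as indicative
rather than precise.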

(Note the buffers are still not being reused.)

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nbdcopy3.svg.xz
Type: application/x-xz
Size: 28744 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/libguestfs/attachments/20210526/84fd7c76/attachment.xz>

