[Libguestfs] Scaling virt-df performance

Dan Ryder (daryder) daryder at cisco.com
Wed Sep 10 19:35:03 UTC 2014


Hi Richard,

Thanks a lot for the response - very helpful to go through.

I'm using libguestfs 1.26.5 on Ubuntu 14.04, running on a bare-metal server. I was simply using "virt-df", but after looking into the "-P" option a little more I have incorporated it. That greatly improves performance compared to the original runs (free memory on the box ranged from 2-3G, which wouldn't have given me many parallel threads). Question about this: is it safe to force a larger thread count without checking available memory first?
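
For reference, the invocation I'm using now looks roughly like this (the thread count is just an example):

    # Run virt-df over all libvirt guests with an explicit number of
    # parallel appliance threads, rather than the memory-based default.
    virt-df -P 8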

I ran the baselines and got the following:
	Starting the appliance = ~3.4s
	Performing inspection of a guest = ~5s
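
Both were measured along the lines of guestfs-performance(1), i.e. something like this (the disk image name below is a placeholder):

    # Baseline 1: time to launch the appliance against a null disk.
    time guestfish -a /dev/null run

    # Baseline 2: time to launch the appliance and inspect one guest.
    time guestfish --ro -a guest.img -i exit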

I also looked at your blog posts - very interesting stuff. I played with setting "LIBGUESTFS_ATTACH_METHOD=appliance", but didn't notice much difference. I'm testing on a Qemu-KVM/OpenStack host with 29 guests running. Since KVM's default limit is 20 client connections, I expected a small improvement from setting the env var above. After setting it, I changed "-P" from 20 to 29, but didn't see any difference.
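
Concretely, what I tried was along these lines (an illustrative sketch, not a verbatim transcript):

    # Bypass libvirt and launch qemu directly, then raise the thread
    # count so one pass covers all 29 guests.
    export LIBGUESTFS_ATTACH_METHOD=appliance
    virt-df -P 29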

Any additional suggestions as the scale increases dramatically to > 3000 guests? (This will likely be on a system with much more available memory.) Ideally we would like to gather disk used/free for all guests at intervals of under 5 minutes - do you think this is possible using virt-df?
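
To make the target concrete, I'm picturing something like one collector per hypervisor (the host names and -P value here are hypothetical):

    # One virt-df pass per compute node, run in parallel, emitting
    # CSV for a stats pipeline; -P sized to each host's free memory.
    for host in compute-01 compute-02 compute-03; do
        virt-df -c "qemu+ssh://$host/system" --csv -P 12 > "df-$host.csv" &
    done
    wait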


Thanks!
Dan Ryder


-----Original Message-----
From: Richard W.M. Jones [mailto:rjones at redhat.com] 
Sent: Wednesday, September 10, 2014 12:32 PM
To: Dan Ryder (daryder)
Cc: libguestfs at redhat.com
Subject: Re: [Libguestfs] Scaling virt-df performance

On Wed, Sep 10, 2014 at 01:38:16PM +0000, Dan Ryder (daryder) wrote:
> Hello,
>
> I have been looking at the "virt-df" libguestfs tool to get 
> guest-level disk used/free statistics - specifically with 
> Qemu-KVM/Openstack. This works great for a few Openstack instances, 
> but when I begin to scale (even to ~30 instances/guests) the 
> performance really takes a hit. The time it takes for the command to 
> complete seems to scale linearly with the number of guests/domains 
> running on the hypervisor (note - I am using "virt-df" for all guests, 
> not specifying one at a time; although I've tried that, too).
>
> For ~30 guests, the "virt-df" command takes around 90 seconds to 
> complete. We are looking to support disk used/free collection at a 
> scale of 3,000-30,000 guests. It looks like this won't be remotely 
> possible using "virt-df".

With sufficient memory, non-nested, on hardware built in the last 3 years, you should get performance of about 1 second / guest (pipelined) - roughly 30 seconds for your 30 guests.  So the 90 seconds you report is about 3 times higher than it should be.

Just to get some basic things out of the way:

- What version of virt-df are you using and on what distro?

- Is this nested?

- What exact virt-df command(s) are you running?

- Are you using the -P option?

- How much free memory is on the system?  virt-df runs multiple
  threads in parallel, but the number to run is computed according to
  the amount of free memory[1] (a rough sketch follows below).
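
As a very rough sketch of that estimate (the per-appliance memory figure is an assumption; see [1] for the actual logic):

    # Approximate the -P value virt-df would choose: free memory
    # divided by what one appliance needs (assumed here to be ~500 MB).
    free_mb=$(awk '/MemFree/ { print int($2 / 1024) }' /proc/meminfo)
    echo "estimated threads: $(( free_mb / 500 ))"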

Also have a look at the guestfs-performance manual[2].  I'm especially interested in the results of the baseline measurements on that page, but the rest of the page should answer some of your questions too.
There's also an interesting Perl script you might try playing with.

Rich.

[1] https://github.com/libguestfs/libguestfs/blob/master/df/estimate-max-threads.c
[2] http://libguestfs.org/guestfs-performance.1.html


--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW



