[Libguestfs] Scaling virt-df performance

Richard W.M. Jones rjones at redhat.com
Wed Sep 10 20:25:51 UTC 2014


On Wed, Sep 10, 2014 at 07:35:03PM +0000, Dan Ryder (daryder) wrote:
> Hi Richard,
> 
> Thanks a lot for the response - very helpful to go through.

> I'm using libguestfs 1.26.5 on an Ubuntu 14.04 OS, which is running
> as a baremetal server. I was simply using "virt-df", but after
> looking into the "-P" option a little more I have incorporated
> it. That greatly improves the performance compared to the original
> (I checked the free memory available and it ranged from 2-3G, which
> wouldn't have given me as many parallel threads running). Question
> about this - is it safe to enforce a larger threadcount without
> checking available memory first?

Ah well, that's a question.

You can control the amount of memory given to the appliance by setting
LIBGUESTFS_MEMSIZE (integer number in megabytes, default 500).
However, that doesn't mean the appliance actually uses 500 MB -- it's
just malloc'd memory which is allocated on demand, and a very
short-running appliance like the one virt-df uses likely touches only
a small fraction of it.

A tool like perf, or even just getrusage(2), can help you understand
the memory requirements of each instance; then you multiply by the
number of threads (minus a little, because qemu's code pages are
shared).
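
For instance, a minimal sketch in Python, assuming you just want the
peak RSS of a single virt-df run ("guest.img" is a placeholder path;
on Linux ru_maxrss is reported in kilobytes):

  import resource
  import subprocess

  # Run one virt-df instance to completion.
  subprocess.check_call (["virt-df", "-a", "guest.img"])

  # Peak resident set size of all child processes, in kilobytes.
  peak_kb = resource.getrusage (resource.RUSAGE_CHILDREN).ru_maxrss
  print "peak RSS of children: %d KB" % peak_kb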

You could crank down LIBGUESTFS_MEMSIZE to the smallest size you think
you can get away with, and consequently run lots more threads.
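
For example (just a sketch -- 256 MB is only a guess at a workable
size, and the disk names are placeholders):

  export LIBGUESTFS_MEMSIZE=256
  virt-df -P 20 -a guest1.img -a guest2.img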

> I ran the baselines and got the following:
> 	Starting the appliance = ~3.4s
> 	Performing inspection of a guest = ~5s

Those are about right.

Notes:

(1) virt-df doesn't use inspection.

(2) Parallelized virt-df is, amortized, much faster.  On my laptop it
can usually clear guests at a rate of more than one per second.

> I also looked at your blog posts - very interesting stuff. I played
> with setting "LIBGUESTFS_ATTACH_METHOD=appliance", but didn't notice
> much difference here.

On Ubuntu, that's the default!  Even on Fedora, where setting it
avoids the overhead of libvirt, TBH the overhead is hardly noticeable.

You can find the default setting by doing:

  guestfish get-backend

(on Ubuntu this should print 'direct', which is the new name for
'appliance' and refers to exactly the same mode).
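
In recent versions the environment variable has been renamed too, so
you can force the backend explicitly with:

  export LIBGUESTFS_BACKEND=direct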

> I'm testing on a Qemu-KVM/Openstack with 29 guests running. KVM has
> the default client connections as 20, so I expected to see a little
> improvement when setting the above env var. After setting, I changed
> "-P" to 29 instead of 20, but didn't see any difference.

I wasn't aware that KVM had a limit?  libvirt has a client connection
limit of 20 (the max_clients setting in /etc/libvirt/libvirtd.conf),
but you're probably not using libvirt.

> Any additional suggestions as the scale dramatically increases to >
> 3000 guests (This will likely be on a system with much more
> available memory)? Ideally we would like to gather guest disk
> used/free <5 minute intervals for all guests - do you think this is
> possible using virt-df?

You'll have to run the memory numbers above.  If you have lots of
memory, then maybe yes, but it won't be efficient.  As a rough,
back-of-the-envelope example: if each instance peaks at (say) 300 MB,
then covering 3000 guests every 5 minutes at about one guest per
second per thread needs on the order of 10 threads, ie. about 3 GB.

You might consider using one appliance for many guests.  With
virtio-scsi you can connect up to 254 disks per appliance (we could
actually increase this to effectively unlimited with a smallish code
change).  There are problems with this scenario, but it'll be a lot
faster.  The problems are:

 - Won't work if guests have filesystems or LVs with the same UUIDs
   (happens when guests are cloned or run from templates).

 - Also won't work if guests have LVs with the same name.

 - Probably insecure if used with multi-tenancy, since in theory one
   guest could "crack" the appliance and exploit other disks.

 - virt-df doesn't support this mode, so you'd have to write a bit of
   code that calls g.statvfs and prints out the results.  (See the
   attachment, and the usage sketch below.)
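
For example, assuming you save the attached script as statvfs-many.py
(a name picked just for illustration) and your guest disks are in the
default Nova locations:

  ./statvfs-many.py /var/lib/nova/instances/*/disk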

If you were using libvirt, I could also suggest looking into
hot-plugging, although TBH hot-plug is almost as slow as running
separate appliances.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html
-------------- next part --------------
#!/usr/bin/python
# Print statvfs results for every filesystem found on the given disks.
# Usage: <script> disk [disk ...]

import sys
import guestfs

g = guestfs.GuestFS (python_return_dict=True)

# Add every disk read-only so running guests are not disturbed.
for disk in sys.argv[1:]:
    g.add_drive_opts (disk, readonly=1)

g.launch ()

# list_filesystems returns a dict mapping device name -> filesystem type.
fses = g.list_filesystems ()
for dev in fses:
    # Skip anything that can't be mounted, eg. swap partitions.
    if fses[dev] in ("swap", "unknown"):
        continue
    try:
        g.mount_ro (dev, "/")
        stats = g.statvfs ("/")
        print "%s: %s" % (dev, stats)
    except RuntimeError:
        # Ignore filesystems which fail to mount.
        pass
    g.umount_all ()

