[libvirt] Qemu/KVM guest boots 2x slower with vhost_net

Reeted reeted at shiftmail.org
Tue Oct 4 23:12:47 UTC 2011


Hello all,
for people in qemu-devel list, you might want to have a look at the 
previous thread about this topic, at
http://www.spinics.net/lists/kvm/msg61537.html
but I will try to recap here.

I found that virtual machines in my host booted 2x slower (on average 
it's 2x slower, but probably some parts are at least 3x slower) under 
libvirt compared to manual qemu-kvm launch. With the help of Daniel I 
narrowed it down to the vhost_net presence (default active when launched 
by libvirt) i.e. with vhost_net, boot process is *UNIFORMLY* 2x slower.

The problem is still reproducible on my systems but these are going to 
go to production soon and I am quite busy, I might not have many more 
days for testing left. Might be just next saturday and sunday for 
testing this problem, so if you can write here some of your suggestions 
by saturday that would be most appreciated.


I have performed some benchmarks now, which I hadn't performed in the 
old thread:

openssl speed -multi 2 rsa : (cpu benchmark) show no performance 
difference with or without vhost_net
disk benchmarks : show no performance difference with or without vhost_net
the disk benchmarks were: (both with cache=none and cache=writeback)
dd streaming read
dd streaming write
fio 4k random read in all cases of cache=none, cache=writeback with host 
cache dropped before test, cache=writeback with all fio data in host 
cache (measures context switch)
fio 4k random write

So I couldn't reproduce the problem with any benchmark that came to my mind.

But in the boot process this is very visible.
I'll continue the description below
before that, here are the System Specifications:
---------------------------------------
Host is with kernel 3.0.3 and Qemu-KVM 0.14.1, both vanilla and compiled 
by me.
Libvirt is the version in Ubuntu 11.04 Natty which is 0.8.8-1ubuntu6.5 . 
I didn't recompile this one

VM disks are LVs of LVM on MD raid array.
The problem shows identically on both cache=none and cache=writeback. 
Aio native.

Physical CPUs are: dual westmere 6-core (12 cores total, + hyperthreading)
2 vCPUs per VM.
All VMs are idle or off except the VM being tested.

Guests are:
- multiple Ubuntu 11.04 Natty 64bit with their 2.6.38-8-virtual kernel: 
very-minimal Ubuntu installs with deboostrap (not from the Ubuntu installer)
- one Fedora Core 6 32bit with a 32bit 2.6.38-8-virtual kernel + initrd 
both taken from Ubuntu Natty 32bit (so I could have virtio). Standard 
install (except kernel replaced afterwards).
Always static IP address in all guests
---------------------------------------

All types of guests show this problem, but it is more visible in the FC6 
guest because the boot process is MUCH longer than in the 
debootstrap-installed ubuntus.

Please note that most of boot process, at least from a certain point 
onwards, appears to the eye uniformly 2x or 3x slower under vhost_net, 
and by boot process I mean, roughly, copying by hand from some screenshots:


Loading default keymap
Setting hostname
Setting up LVM - no volume groups found
checking ilesystems... clean ...
remounting root filesystem in read-write mode
mounting local filesystems
enabling local filesystems quotas
enabling /etc/fstab swaps
INIT entering runlevel 3
entering non-interactive startup
Starting sysstat: calling the system activity data collector (sadc)
Starting background readahead

********** starting from here it is everything, or almost everything, 
much slower

Checking for hardware changes
Bringing up loopback interface
Bringing up interface eth0
starting system logger
starting kernel logger
starting irqbalance
starting potmap
starting nfs statd
starting rpc idmapd
starting system message bus
mounting other filesystems
starting PC/SC smart card daemon (pcscd)
starint hidd ... can't open HIDP control socket : address familiy not 
supported by protocol (this is an error due to backporting a new ubuntu 
kernel to FC6)
starting autofs: loading autofs4
starting automount
starting acpi daemon
starting hpiod
starting hpssd
starting cups
starting sshd
starting ntpd
starting sendmail
starting sm-client
startingg console mouse services
starting crond
starting xfs
starting anacron
starting atd
starting youm-updatesd
starting Avahi daemon
starting HAL daemon


 From the point I marked, onwards, most are services, i.e. daemons 
listening from sockets, so I have thought that maybe the binding to a 
socket could have been slower under vhost_net, but trying to put nc in 
listening with: "nc -l 15000" is instantaneous, so I am not sure.

The shutdown of FC6 with basically the same services as above which tear 
down, is *also* much slower on vhost_net.

Thanks for any suggestions
R.




More information about the libvir-list mailing list