[vfio-users] Mitigating CPU Stutter

Sun Sep 18 14:27:41 UTC 2016

There are a lot of things you can do to reduce stuttering, but what makes the
most difference is probably dedicating CPUs exclusively to the VM. Pinning
by itself only prevents the host scheduler from migrating the VM threads
around. Other processes will still compete for CPU time and preempt the VM
threads. Most documentation uses the isolcpus kernel parameter to isolate
CPUs, but that reserves the cores permanently, from boot. A more flexible way
is to use cpuset to achieve the same thing, but only while the VM is running.
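
To illustrate why pinning alone is not isolation, here is a minimal Python
sketch (Linux only). It restricts the calling process to one CPU using the
standard os.sched_setaffinity call. The helper name pin_to_cpus and the choice
of the lowest allowed CPU are just for illustration. Note that nothing in it
stops other tasks from being scheduled on that same CPU, which is the gap
isolcpus or cpuset fills.

```python
import os


def pin_to_cpus(pid, cpus):
    """Restrict `pid` (0 = the calling process) to `cpus`; return the new affinity."""
    os.sched_setaffinity(pid, cpus)
    return os.sched_getaffinity(pid)


allowed = os.sched_getaffinity(0)   # CPUs this process may currently run on
target = {min(allowed)}             # illustrative choice: the lowest allowed CPU
print("pinned to:", pin_to_cpus(0, target))

# Pinning only limits where *this* process runs. Every other task on the
# host may still be scheduled on the same CPU and preempt it.
os.sched_setaffinity(0, allowed)    # undo the pinning again
```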

There are various frontends to cpuset you could use. I use a tool called
cset. It's old and a bit buggy so you might want to look for other solutions.
https://rt.wiki.kernel.org/index.php/Cpuset_Management_Utility/tutorial

Alex Williamson posted about various isolation strategies here:
https://www.redhat.com/archives/vfio-users/2015-September/msg00041.html
I have an i7-4790K with 4 hyperthreaded cores (8 "cpus") and use the strategy
recommended in that post. I give CPUs 0,4 (core 0) to the host and reserve
1-3,5-7 (cores 1-3) for the VM. I pin VCPUs 0-2 to CPUs 1-3 and the emulator
to CPUs 5-7. The downside is that the VM only gets 3 CPUs out of 8, but that
hasn't been a problem for me so far.
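
As a rough sketch of that layout, the Python below expands kernel-style CPU
lists and prints a libvirt <cputune> fragment using the vcpupin and
emulatorpin elements. The function names are made up for this example, and
the output is not a copy of the linked libvirt XML. A real domain also needs
a matching <vcpu> count.

```python
def parse_cpu_list(spec):
    """Expand a kernel-style CPU list such as "1-3,5-7" into [1, 2, 3, 5, 6, 7]."""
    cpus = []
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def cputune_xml(vcpu_cpus, emulator_cpus):
    """Build a <cputune> element that pins vCPU n to the n-th CPU in vcpu_cpus."""
    lines = ["<cputune>"]
    for vcpu, cpu in enumerate(parse_cpu_list(vcpu_cpus)):
        lines.append(f"  <vcpupin vcpu='{vcpu}' cpuset='{cpu}'/>")
    lines.append(f"  <emulatorpin cpuset='{emulator_cpus}'/>")
    lines.append("</cputune>")
    return "\n".join(lines)


# Layout from the post: VCPUs 0-2 on CPUs 1-3, emulator threads on CPUs 5-7.
print(cputune_xml("1-3", "5-7"))
```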

Unfortunately cpuset cannot migrate kernel threads, so they will still
compete with the VM for CPU time. The kernel has some documentation on how
to avoid that:
https://www.kernel.org/doc/Documentation/kernel-per-CPU-kthreads.txt
Basically you'll need to use irqbalance to make sure only the CPUs allocated
to the host service interrupts, and also make sure workqueues run on those
CPUs. The kernel uses workqueue kthreads to do writeback, and they cause huge
latencies when the host does large disk writes if they run on the same CPUs
as the VM. They can be moved by writing a CPU mask to the
/sys/devices/virtual/workqueue/cpumask sysfs file.
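
The workqueue cpumask file (like the smp_affinity files under /proc/irq)
takes a hexadecimal bitmask with one bit per CPU. Here is a small Python
sketch of how the masks for the layout above could be computed. The helper
names are invented for this example, and the write needs root on a real host.
The VM mask is the kind of value irqbalance accepts as a list of banned CPUs,
assuming your version supports IRQBALANCE_BANNED_CPUS.

```python
def cpus_to_mask(cpus):
    """Return the hex bitmask with one bit set per CPU, e.g. [0, 4] -> "11"."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    # Plain hex is fine for small systems like this 8-CPU example. The kernel
    # prints masks wider than 32 bits in comma-separated 32-bit groups.
    return format(mask, "x")


def write_mask(path, cpus):
    """Write the mask for `cpus` to a sysfs/procfs mask file (needs root)."""
    with open(path, "w") as f:
        f.write(cpus_to_mask(cpus))


HOST_CPUS = [0, 4]              # core 0 and its hyperthread sibling
VM_CPUS = [1, 2, 3, 5, 6, 7]    # cores 1-3 and their siblings

print("host mask:", cpus_to_mask(HOST_CPUS))   # bits 0 and 4
print("VM mask:  ", cpus_to_mask(VM_CPUS))     # bits 1-3 and 5-7

# On a real host, as root, before starting the VM:
#   write_mask("/sys/devices/virtual/workqueue/cpumask", HOST_CPUS)
# and after it stops, hand all 8 CPUs back:
#   write_mask("/sys/devices/virtual/workqueue/cpumask", range(8))
```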

The documentation also mentions that you can give the VM realtime priority so
that it preempts the kernel threads, but this is a bad idea. The NO_HZ_FULL
(full tickless) mode only works if a single process wants to run on a core.
When the VM thread runs at realtime priority it can starve the kernel threads
for long periods of time, and the scheduler will turn off NO_HZ_FULL when
that happens since several processes want to run. To get the full advantage
of NO_HZ_FULL, don't use realtime priority.

Turning off power saving probably also helps, but I don't know how much. Here
is an article describing how to turn off power saving on modern Intel CPUs,
with example code: https://access.redhat.com/articles/65410
Setting the cpufreq scaling governor to performance is not enough.
According to Intel's documentation, "Long term reliability cannot be assured
unless all the Low-Power Idle States are enabled", but that is probably not a
problem if you only disable power saving while the VM is running.
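
One kernel mechanism for this is the PM QoS interface /dev/cpu_dma_latency. A
process writes the maximum wakeup latency it can tolerate, and the request
holds only while the file stays open. The Python sketch below is one way to
use it and is not taken from the article or from my script. The function name
is made up, and I am assuming a value of 0 (keep the cores out of deep idle
states) is what you want.

```python
import os
import struct

PM_QOS_DEVICE = "/dev/cpu_dma_latency"


def request_max_latency(latency_us=0, path=PM_QOS_DEVICE):
    """Ask the kernel to keep CPU wakeup latency at or below latency_us.

    Returns the open file descriptor. The request is active only while the
    descriptor stays open; closing it restores normal power saving.
    """
    fd = os.open(path, os.O_WRONLY)
    # The device accepts a native signed 32-bit value in microseconds.
    os.write(fd, struct.pack("i", latency_us))
    return fd


# Usage on a real host (as root), around the VM's lifetime:
#   fd = request_max_latency(0)   # keep cores out of deep C-states
#   ... run the VM ...
#   os.close(fd)                  # allow deep C-states again
```

Because the request disappears when the descriptor is closed, power saving
comes back by itself once the process holding it exits.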

Here is my script for setting up all of these things and undoing them when
the VM stops. The maximum latency I've seen when using this script is 280us,
no matter what I do on the host (compile jobs, torrenting, copying large
files, etc.):
http://sprunge.us/JUfS

My libvirt XML:
http://sprunge.us/gFdY