[vfio-users] MSI Interrupt balance in guests

Colin Godsey crgodsey at gmail.com
Thu Mar 31 20:57:03 UTC 2016


I had some questions about IRQ balancing with MSI interrupts from
passthrough devices using VFIO.

I have a system that is designed to serve as a game streaming server for
2 Windows VMs, each using PCIe passthrough. So far the recurring problem has
been interrupt management. This system ends up being a perfect storm (no
pun intended) for interrupts, having to manage: high-performance timers for
video and audio, heavy data transfer from the GPU (more than normal because
we’re passing a 50 Mbps video stream from the GPU to the system), then
50 Mbps per VM transferred back to the host via virtio networking. I’m giving
full CPU mapping to both VMs, with staggered affinity (VM0 VCPU0=CPU0, VM1
VCPU0=CPU2). It is a non-HT Intel SMP CPU.

MSI so far seems to be the key to getting this all to work right. Enabling
MSI was the first big thing: when the guests did not detect it, interrupt
management was really poor, and reading more about it explained why.

The low-latency kernel seems to be the other part of this. Games like to
run on an accurate clock, and context switching seems minimal when the GPU
can drive frame composition like normal, and in the end I think it wastes
fewer CPU cycles due to accurate media response. Even though this system
deals with large amounts of throughput, latency and device responsiveness
are still the priority.

This leads to the last piece of the puzzle: a mysterious performance drop
across the board when both systems are under GPU load. Both MSI GPUs seem
to be piling the majority of their interrupts onto the same CPU. The
MSI-based Ethernet adapter seems to fall on another CPU, so that’s great. But
the GPU load seems to just tank overall guest performance, and the GPU
interrupts don’t seem to want to budge from CPU1 no matter what. There is
some distribution of the GPU interrupts across CPUs, but the bulk seems to
always land on CPU1.
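
To make the distribution described above measurable, here is a minimal sketch
of tallying per-CPU interrupt counts from the host's /proc/interrupts. The
function name and the default "vfio" match string are assumptions for
illustration; it assumes passthrough IRQs show up with "vfio" in their
description column, which may differ on other setups.

```python
def per_cpu_counts(interrupts_text, needle="vfio"):
    """Sum per-CPU interrupt counts over /proc/interrupts rows containing `needle`.

    `needle` is an assumed substring of the IRQ description column.
    """
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    cpus = lines[0].split()            # header row: CPU0 CPU1 ...
    totals = {cpu: 0 for cpu in cpus}
    for line in lines[1:]:
        if needle not in line:
            continue
        fields = line.split()
        # fields[0] is the IRQ label (e.g. "45:"); the next len(cpus)
        # fields are the per-CPU counters for that IRQ.
        counts = fields[1:1 + len(cpus)]
        for cpu, count in zip(cpus, counts):
            if count.isdigit():
                totals[cpu] += int(count)
    return totals
```

On the host this could be fed with the contents of /proc/interrupts, sampled
twice under load so that the difference shows where new interrupts land.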

Ultimately what I’m wondering:
* How does MSI balance? Via VFIO, are the MSI interrupts
assigned/mapped/masked on a guest level, host level, or hardware level?
* Also, how does x2APIC relate to MSI and/or VFIO? I’m overriding my BIOS
opt-out to force x2APIC. Is there any reason to not do this? Does this
interfere with MSI at all? Why would my BIOS want to disable a much larger
APIC? Can QEMU/KVM even utilize this?
* Could my performance issues just be related to context-switching and
CPU/cache bounds? Would the low-latency kernel (1000 Hz) make that worse?
If so, is there any way to force MSI affinity?
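
For context on the affinity question: the generic Linux host interface for
steering an IRQ is the hex CPU bitmask in /proc/irq/<N>/smp_affinity. Below is
a minimal sketch of building and writing that mask. The function names and the
`proc_root` parameter are hypothetical, it assumes fewer than 32 CPUs (so a
single mask word), and whether this actually helps for VFIO MSI vectors is
exactly what is being asked here.

```python
def affinity_mask(cpus):
    """Return the hex bitmask string selecting the given CPU indices."""
    mask = 0
    for cpu in cpus:
        if not 0 <= cpu < 32:
            # Larger systems use comma-separated 32-bit groups; not handled here.
            raise ValueError("this sketch only handles CPU indices 0..31")
        mask |= 1 << cpu
    return format(mask, "x")


def set_irq_affinity(irq, cpus, proc_root="/proc"):
    """Write the mask to <proc_root>/irq/<irq>/smp_affinity (needs root on a real host)."""
    path = f"{proc_root}/irq/{irq}/smp_affinity"
    with open(path, "w") as f:
        f.write(affinity_mask(cpus))
    return path
```

Note that a daemon such as irqbalance, if running, may rewrite the mask later,
and that this only covers the host side; affinity inside the guest is set
separately by the guest OS.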

Thanks tons!

-Colin G