[libvirt] [PATCHv2 15/15] qemu: set CAP_COMPROMISE_KERNEL so that pci passthrough works

Laine Stump laine at laine.org
Tue Feb 12 20:15:49 UTC 2013


(NOTE: This patch will hopefully *not* be needed, as some kernel
people are trying to eliminate the need by changing
CAP_COMPROMISE_KERNEL so that it's only checked on open(), not on
read/write. I'm only including it here for completeness, but have no
plans to push it.)

On any system whose kernel provides CAP_COMPROMISE_KERNEL, PCI
passthrough device assignment has not worked unless 1) qemu is run
as root *and* 2) "clear_emulator_capabilities=0" is set in
/etc/libvirt/qemu.conf.

This patch is the final piece to make PCI passthrough once again work
properly with a non-root qemu. It sets CAP_COMPROMISE_KERNEL; now that
virCommand is properly set up to honor that request for non-root child
processes, it will actually do some good.

It is still necessary to set the file capability for the qemu binary,
however (see the rules for determining effective caps of a process
running as non-root in "man 7 capabilities"). This can be done with:

  filecap $path-to-qemu-binary compromise_kernel
---
Change from V1: rebased.

 src/qemu/qemu_process.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 6c70246..b681ad5 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -3942,6 +3942,17 @@ int qemuProcessStart(virConnectPtr conn,
     if (cfg->clearEmulatorCapabilities)
         virCommandClearCaps(cmd);
 
+#ifdef CAP_COMPROMISE_KERNEL
+    /* On kernels that have this capability, we must set it for qemu
+     * in order for PCI passthrough with -device pci-assign to work.
+     * The qemu binary must also have that *file* capability, set
+     * with "filecap /usr/bin/qemu-kvm compromise_kernel", for
+     * example. The alternative is to run qemu as root with
+     * "clear_emulator_capabilities=0" in /etc/libvirt/qemu.conf.
+     */
+    virCommandAllowCap(cmd, CAP_COMPROMISE_KERNEL);
+#endif
+
     /* in case a certain disk is desirous of CAP_SYS_RAWIO, add this */
     for (i = 0; i < vm->def->ndisks; i++) {
         virDomainDiskDefPtr disk = vm->def->disks[i];
-- 
1.8.1



