[libvirt] pci-stub error and MSI-X for KVM guest

Daniel P. Berrange berrange at redhat.com
Fri Jan 8 11:04:34 UTC 2010

On Thu, Jan 07, 2010 at 04:50:03PM -0800, Chris Wright wrote:
> * Fischer, Anna (anna.fischer at hp.com) wrote:
> > So, when setting a breakpoint for the exit() call I'm getting a bit closer to figuring where it kills my guest.
> Thanks, this helps clarify what is happening.
> > Breakpoint 1, exit (status=1) at exit.c:99
> > 99	{
> > Current language:  auto
> > The current source language is "auto; currently c".
> > (gdb) bt
> > #0  exit (status=1) at exit.c:99
> > #1  0x0000000000470c6e in assigned_dev_pci_read_config (d=0x259c6f0, address=64, len=4)
> assigned_dev_pci_read_config(..., 64, 4)
>                                   ^^
> This is a libvirt issue.  When you use virt-manager it has libvirtd
> fork/exec qemu-kvm.  libvirtd will drop privileges and run qemu-kvm as
> user qemu (or perhaps root if you've edited qemu.conf).  Regardless of
> the user, it clears capabilities.  Reading PCI config space beyond just
> the header requires CAP_SYS_ADMIN.  The above is reading the first 4
> bytes of device dependent config space, and the kernel is returning 0
> because qemu doesn't have CAP_SYS_ADMIN.

Hmm, libvirt also chown()s the files in /sys/bus/pci/devices/<DEVICE>/*
to 'qemu' (and sets the SELinux context) so that the unprivileged QEMU
process has full read/write access to them. I had hoped that would
avoid the need for any capabilities like CAP_SYS_ADMIN :-(
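
To make the failure mode concrete, here is a minimal sketch of a PCI
capability-list walk over a config space snapshot. It is not the qemu-kvm
code; the function names and the synthetic 256-byte buffer are made up for
illustration. It assumes the behaviour Chris describes: without
CAP_SYS_ADMIN only the 64-byte header is readable, so the capability
chain (which starts at or after offset 0x40) is never seen.

```python
# Standard PCI config space offsets and capability IDs.
PCI_STATUS = 0x06            # 16-bit status register
PCI_STATUS_CAP_LIST = 0x10   # status bit: capability list present
PCI_CAPABILITY_LIST = 0x34   # pointer to first capability
PCI_CAP_ID_MSI = 0x05
PCI_CAP_ID_MSIX = 0x11


def find_capabilities(config):
    """Walk the capability list in `config` (bytes); return {cap_id: offset}."""
    caps = {}
    if len(config) < 64:
        return caps
    status = int.from_bytes(config[PCI_STATUS:PCI_STATUS + 2], "little")
    if not status & PCI_STATUS_CAP_LIST:
        return caps
    pos = config[PCI_CAPABILITY_LIST] & 0xFC
    seen = set()
    while pos >= 0x40 and pos not in seen:
        if pos + 2 > len(config):
            break  # snapshot was truncated; the rest of the chain is invisible
        seen.add(pos)
        caps[config[pos]] = pos
        pos = config[pos + 1] & 0xFC  # next-capability pointer
    return caps


def make_fake_config():
    """Hypothetical device: MSI at 0x40 chained to MSI-X at 0x50."""
    cfg = bytearray(256)
    cfg[PCI_STATUS] = PCI_STATUS_CAP_LIST
    cfg[PCI_CAPABILITY_LIST] = 0x40
    cfg[0x40], cfg[0x41] = PCI_CAP_ID_MSI, 0x50
    cfg[0x50], cfg[0x51] = PCI_CAP_ID_MSIX, 0x00
    return bytes(cfg)


full = make_fake_config()   # what a CAP_SYS_ADMIN reader would get
truncated = full[:64]       # header only, as seen without the capability

print(find_capabilities(full))       # MSI and MSI-X are found
print(find_capabilities(truncated))  # nothing is found
```

On a real host the buffer would come from reading
/sys/bus/pci/devices/<DEVICE>/config in binary mode; the chown() on that
file does not change how many bytes the kernel is willing to return.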

> Basically, this means that device assignment w/ libvirt will break
> MSI/MSI-X because qemu will never be able to see that the host device
> has those PCI capabilities.  This, in turn, renders VF device assignment
> useless (since a VF is required to support MSI and/or MSI-X).
> Granting CAP_SYS_ADMIN for each qemu instance that does device assignment
> would render the privilege reduction useless (CAP_SYS_ADMIN is the
> kitchen sink catchall of the Linux capability system).

Yeah, that's pretty troublesome: even when libvirt runs QEMU as 'root', it
removes all capabilities. Why is the CAP_SYS_ADMIN check there? Is it a
mistakenly over-zealous permission check that could be removed, relying
instead on access controls on the sysfs
/sys/bus/pci/devices/<DEVICE>/config file?
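
As a quick way to confirm what a given process actually ends up with, the
following sketch parses the CapEff line from the text of
/proc/<pid>/status and tests the CAP_SYS_ADMIN bit (bit 21 in
<linux/capability.h>). The helper name and the sample strings are
hypothetical; only the CapEff field and the bit number come from the
kernel.

```python
CAP_SYS_ADMIN = 21  # bit number from <linux/capability.h>


def has_capability(status_text, cap=CAP_SYS_ADMIN, field="CapEff"):
    """Return True if `cap` is set in the given capability mask line."""
    for line in status_text.splitlines():
        if line.startswith(field + ":"):
            mask = int(line.split(":", 1)[1].strip(), 16)
            return bool(mask & (1 << cap))
    raise ValueError("no %s line found" % field)


# Hypothetical samples: a process with every capability bit set, and one
# whose capabilities have all been cleared.
sample_full = "Name:\tqemu-kvm\nCapEff:\tffffffffffffffff\n"
sample_dropped = "Name:\tqemu-kvm\nCapEff:\t0000000000000000\n"

print(has_capability(sample_full))     # True
print(has_capability(sample_dropped))  # False
```

Against a live system, pass in the contents of /proc/<pid>/status for the
qemu-kvm process; a root-owned process with cleared capabilities should
report False here, matching the behaviour discussed above.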

|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
