NVMe drive PCI passthrough and surprise hotplug

Kalra, Ashish Ashish.Kalra at amd.com
Thu Feb 3 23:25:05 UTC 2022


Hi,
I am using Fedora 33 with the following KVM, QEMU and libvirt versions:
QEMU 5.1.0
libvirt 6.6.0
KVM 5.14.18

We passed a PCIe NVMe device through to a guest running on FC33, using either
virt-manager or virsh, and then hot-unplugged the device while it was still
attached to the guest.

After the unplug, the device no longer appears in the guest's hardware device
list in virt-manager. We then hotplug the device again and can use it on the
host, but when we try to re-attach it to the guest, we get the following error message:

Requested operation is not valid, PCI device 0000:c4:00.0 is in use by driver QEMU,
Domain fedora 33.

So somehow libvirt still thinks the hot-unplugged device is attached.
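
One way to see what libvirt has recorded for the guest is to look for the host
address in the domain XML (for example the output of `virsh dumpxml`). The
following is a minimal Python sketch, not part of libvirt: the helper name
`pci_hostdevs` and the sample XML are made up for illustration, and it assumes
the usual `<hostdev type='pci'>` layout with a `<source><address .../>` child.
Note that this only inspects the domain definition; libvirt may track in-use
host devices in a separate internal list, so an empty result here would not by
itself rule out stale state.

```python
import xml.etree.ElementTree as ET


def pci_hostdevs(domain_xml):
    """Return host PCI addresses (e.g. '0000:c4:00.0') of all PCI hostdevs."""
    root = ET.fromstring(domain_xml)
    found = []
    for hostdev in root.findall("./devices/hostdev"):
        if hostdev.get("type") != "pci":
            continue
        addr = hostdev.find("./source/address")
        if addr is None:
            continue
        # libvirt writes each field as a hex string such as '0xc4'.
        domain = int(addr.get("domain", "0x0"), 16)
        bus = int(addr.get("bus"), 16)
        slot = int(addr.get("slot"), 16)
        function = int(addr.get("function"), 16)
        found.append(f"{domain:04x}:{bus:02x}:{slot:02x}.{function:x}")
    return found


# Hypothetical sample; real input would come from `virsh dumpxml <domain>`.
SAMPLE_XML = """
<domain type='kvm'>
  <name>fedora33</name>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0xc4' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
    </hostdev>
  </devices>
</domain>
"""

print(pci_hostdevs(SAMPLE_XML))
```

If the address still shows up after the unplug, the stale entry is at least
visible in the domain definition.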

Tracing the flow of the hot-unplug event from guest to host:

-> Guest PCIe hotplug support detected the NVMe drive unplug (from guest kernel logs):

pciehp: Slot (0-6): Attention button pressed

pciehp: Slot (0-6): Powering off due to button press.

-> It also looks like the guest notified the host/KVM (from host kernel logs):

pcieport: 0000:c4:0000.0: pciehp: Slot (208): Card not present

-> Correspondingly, the vfio-pci module notified QEMU:

vfio-pci: 0000:c4:0000.0: Relaying device request to user (#0)

-> Then the unplugged device is reset:

vfio-pci: vfio_bar_restore: reset recovery - restoring BARs

pci 0000:c4:00.0: Removing from iommu group 105.
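
To cross-check the host side after events like these, one can look at which
kernel driver (if any) is bound to the PCI function in sysfs. The sketch below
is only an illustration: the function name `bound_driver` and its `sysfs_root`
parameter are hypothetical, and it assumes the standard Linux layout where
`/sys/bus/pci/devices/<address>/driver` is a symlink to the bound driver.

```python
import os


def bound_driver(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the driver bound to a PCI function, or None."""
    driver_link = os.path.join(sysfs_root, bdf, "driver")
    if not os.path.islink(driver_link):
        # Device absent (unplugged) or present but with no driver bound.
        return None
    # The symlink target ends in .../drivers/<name>.
    return os.path.basename(os.readlink(driver_link))


print(bound_driver("0000:c4:00.0"))
```

A result of `vfio-pci` would suggest the device is still reserved for
passthrough, `nvme` that the host driver has claimed it, and `None` that the
function is gone or unbound.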

-> Next, we tried to verify whether libvirt received the DEVICE_DELETED event from QEMU.

Running a SystemTap script to capture events between QEMU and libvirt:

stap examples/systemtap/qemu-monitor.stp

When the NVMe drive is attached to the VM, the following log output is seen from SystemTap:

execute "device-add", driver: "vfio-pci", host: "0000:c4:00.0", id: "hostdev0", bus: "pci.7", addr: "0".

When we hot-unplug the NVMe drive, the following log output is seen from SystemTap:

event: DEVICE_DELETED, device: "hostdev0", path: "/machine/peripheral/hostdev0".

So it looks like QEMU sent the "DEVICE_DELETED" event to libvirt, but libvirt still has not removed the attached
device from its bookkeeping list.
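
To make the expected behaviour concrete, here is a toy Python model of what a
monitor client would do on this event. It is not libvirt's implementation: the
`attached` dictionary and the `handle_qmp_event` helper are hypothetical, and
the timestamp values are placeholders. It only assumes the general QMP event
shape, a JSON object with `event`, `data` and `timestamp` members.

```python
import json


def handle_qmp_event(line, attached):
    """Drop a device alias from `attached` when QEMU reports DEVICE_DELETED.

    `attached` maps a device alias (e.g. 'hostdev0') to a host PCI address.
    Returns the host address that was released, or None.
    """
    msg = json.loads(line)
    if msg.get("event") != "DEVICE_DELETED":
        return None
    # 'device' is only present when the device was given an id.
    alias = msg.get("data", {}).get("device")
    return attached.pop(alias, None)


# Hypothetical bookkeeping state matching the SystemTap output above.
attached = {"hostdev0": "0000:c4:00.0"}
event = json.dumps({
    "event": "DEVICE_DELETED",
    "data": {"device": "hostdev0", "path": "/machine/peripheral/hostdev0"},
    "timestamp": {"seconds": 0, "microseconds": 0},
})
released = handle_qmp_event(event, attached)
print(released, attached)
```

In this model the host address is released as soon as the event arrives; the
behaviour reported here is as if that release step never happened for
0000:c4:00.0.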

I understand there is already a thread from 2020 discussing a similar issue:
https://www.spinics.net/linux/fedora/libvirt-users/msg12590.html

But I am not sure whether any fix or support for this has been added since then.

I am looking for any feedback on the above, and on PCI device passthrough and hotplug support in general.

Thanks,
Ashish