[vfio-users] dev_WARN for hotplugging to live VFIO group

Fri Aug 21 23:11:12 UTC 2020

On 8/21/20 4:43 PM, Alex Williamson wrote:

> When a device is added to a live group there's a risk that it will be
> auto-probed by a host driver, if that occurs then isolation of the
> group has been violated and vfio code will BUG_ON to halt the system.
> The warning is effectively just a notification that we're in a risky
> situation where the wrong driver binding could escalate the issue.
> 
> There is a ToDo in the code at that point to prevent driver probing,
> but ISTR at that time we may not have had a good way to do that.  I'm
> not sure if we do now either.  We have the driver_override field for
> the device that we could write into, but at this point we're looking at
> a generic device, we don't even know that it's a PCI device.  We could
> determine that, but even then it's not clear that the kernel should set
> the policy to define that it should be bound to the vfio-pci driver,
> potentially versus other vfio drivers that could legitimately manage
> the device safely.  If we write a random string to the driver_override
> field we could prevent automatic binding to any driver, but then we put
> a barrier to making use of the device, which seems like it has support
> issues as well.  I'm not sure what the best approach is... that's why
> we currently generate a warning and hope it doesn't happen.

Interesting, it definitely seems like there's no easy generic solution
then.

In my use case, the devices that will be hotplugged have a known
vendor+product ID and are already registered with the vfio-pci driver
via /sys/bus/pci/drivers/vfio-pci/new_id. In this case, it
should be safe to write into driver_override, since the user has
already explicitly stated that they wish to use the vfio-pci driver,
right?
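
For illustration, here is a minimal userspace sketch of that idea in
Python. It is only a sketch under assumptions: the function name
override_matching_devices and the sysfs_root parameter are hypothetical
(the latter exists only so the logic can be tried against a fake
directory tree), and the IDs in the usage note are example values. It
also runs after the device has appeared, so it does not close the
auto-probe window discussed above.

```python
from pathlib import Path


def override_matching_devices(vendor, device, driver="vfio-pci", sysfs_root="/sys"):
    """Write `driver` into driver_override for every PCI device whose
    vendor/device IDs match. Returns the addresses that were updated.

    `sysfs_root` is a hypothetical knob for testing; on a real system it
    stays at "/sys" and the caller needs root privileges.
    """
    updated = []
    devices_dir = Path(sysfs_root) / "bus" / "pci" / "devices"
    for dev in sorted(devices_dir.iterdir()):
        try:
            # sysfs exposes these as hex strings such as "0x1af4"
            dev_vendor = int((dev / "vendor").read_text().strip(), 16)
            dev_device = int((dev / "device").read_text().strip(), 16)
        except (OSError, ValueError):
            continue  # not a readable PCI device entry, skip it
        if (dev_vendor, dev_device) == (vendor, device):
            # Only restricts which driver may bind; it does not bind anything.
            (dev / "driver_override").write_text(driver + "\n")
            updated.append(dev.name)
    return updated
```

Calling override_matching_devices(0x1af4, 0x1110) would target devices
with those example IDs (check the real ones with lspci -nn). Writing
driver_override does not itself bind the device; a probe still has to be
triggered, for example through /sys/bus/pci/drivers_probe.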

> On a truly bare metal platform, I don't think this should ever occur in
> practice without manually removing and re-scanning devices.  We'd
> expect PCIe hotplug to occur on the slot level with isolation to the
> downstream port providing that slot.  Without that isolation, or the
> increasingly unlikely chance of encountering this with conventional PCI
> hotplug, we'd probably hand wave the system as inappropriate for the
> task.  Here I think you have a bare metal hypervisor exposing portions
> of devices to the "host" in unusual ways that can trigger this and are
> expected to be supported.

Sorry, I should have been clearer. I'm encountering the warning when
hotplugging virtual PCI devices (ivshmem) into a guest, which accesses
them from its own userspace via VFIO. No physical PCI device is being
passed through.

I think the QEMU pseries/sPAPR platform differs from conventional x86_64
machine types like Q35 in how it handles hotplug. Specifically, all devices
on a given spapr-pci-host-bridge end up in the same IOMMU group, even
for hotplugged slots. This is why it'd be nice to have a solution to
allow VFIO to handle this gracefully, but it certainly doesn't seem
as straightforward as I'd hoped.
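
The grouping can be inspected from inside the guest through
/sys/kernel/iommu_groups. The following Python sketch lists the devices
in each group; the function name iommu_groups and the sysfs_root
parameter are hypothetical and exist only for illustration and testing.

```python
from pathlib import Path


def iommu_groups(sysfs_root="/sys"):
    """Return a dict mapping IOMMU group number (as a string) to the
    sorted list of device addresses in that group.

    `sysfs_root` is a hypothetical knob so the function can be pointed
    at a fake tree; on a real system it stays at "/sys".
    """
    groups = {}
    base = Path(sysfs_root) / "kernel" / "iommu_groups"
    if not base.is_dir():
        return groups  # no IOMMU groups exposed on this system
    for group in base.iterdir():
        devices = group / "devices"
        if devices.is_dir():
            groups[group.name] = sorted(d.name for d in devices.iterdir())
    return groups
```

If the behavior is as I describe, a hotplugged device behind the same
spapr-pci-host-bridge should show up in the same list as the devices
that were already there, rather than in a new group of its own.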

> Sorry, I don't have a good proposal to
> resolve how we should handle group composition changing while the group
> is in use... thus we currently just whine about it with a warning.

Thank you for sharing your thoughts. It sounds like in my case, where
no other kernel driver is registered to claim the device, this warning
should be fine to ignore in practice, though it'd definitely be nice to
have a way to suppress it in known-safe situations.

> Thanks,
> 
> Alex
>