[PATCH RFC v2 00/13] IOMMUFD Generic interface

Jason Gunthorpe jgg at nvidia.com
Tue Oct 11 12:30:43 UTC 2022


On Mon, Oct 10, 2022 at 04:54:50PM -0400, Steven Sistare wrote:
> > Do we have a solution to this?
> > 
> > If not I would like to make a patch removing VFIO_DMA_UNMAP_FLAG_VADDR
> > 
> > Aside from the approach to use the FD, another idea is to just use
> > fork.
> > 
> > qemu would do something like
> > 
> >  .. stop all container ioctl activity ..
> >  fork()
> >     ioctl(CHANGE_MM) // switch all maps to this mm
> >     .. signal parent.. 
> >     .. wait parent..
> >     exit(0)
> >  .. wait child ..
> >  exec()
> >  ioctl(CHANGE_MM) // switch all maps to this mm
> >  ..signal child..
> >  waitpid(childpid)
> > 
> > This way the kernel is never left without a page provider for the
> > maps, the dummy mm_struct belonging to the fork will serve that role
> > for the gap.
> > 
> > And the above is only required if we have mdevs, so we could imagine
> > userspace optimizing it away for, eg vfio-pci only cases.
> > 
> > It is not as efficient as using a FD backing, but this is super easy
> > to implement in the kernel.
> 
> I propose to avoid deadlock for mediated devices as follows.  Currently, an
> mdev calling vfio_pin_pages blocks in vfio_wait while VFIO_DMA_UNMAP_FLAG_VADDR
> is asserted.
> 
>   * In vfio_wait, I will maintain a list of waiters, each list element
>     consisting of (task, mdev, close_flag=false).
> 
>   * When the vfio device descriptor is closed, vfio_device_fops_release
>     will notify the vfio_iommu driver, which will find the mdev on the waiters
>     list, set elem->close_flag=true, and call wake_up_process for the task.

This alone is not sufficient. The mdev driver can continue to
establish new mappings until its close_device function returns, so
killing only the existing mappings is racy.

I think you are focusing on the one issue I pointed at. As I said, I'm
sure there are more ways than just close to abuse this functionality
to deadlock the kernel.

I continue to prefer we remove it completely and do something more
robust. I suggested two options.
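
To make the fork option quoted above concrete, here is a rough userspace
sketch of the handshake only. It rests on several assumptions: CHANGE_MM
is just the name used in the proposal and is not an existing ioctl, so it
is replaced by a hypothetical stub (change_mm_stub); the function name
mm_handoff_sketch and the use of pipes for signalling are likewise
illustrative; and the exec() step is only marked by a comment, since a
real exec would replace the process image and the pipe fds and child pid
would have to be carried across it.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for ioctl(container_fd, CHANGE_MM).
 * No such ioctl exists; this stub only logs which process called it.
 */
static int change_mm_stub(const char *who)
{
    fprintf(stderr, "%s: CHANGE_MM (stub)\n", who);
    return 0;
}

/* Returns the child's exit status (0 on success) or -1 on error. */
int mm_handoff_sketch(void)
{
    int to_parent[2], to_child[2], status, ok;
    char c = 0;
    pid_t pid;

    /* .. stop all container ioctl activity .. (not modelled here) */
    if (pipe(to_parent) || pipe(to_child))
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: its mm acts as page provider during the gap. */
        close(to_parent[0]);
        close(to_child[1]);
        if (change_mm_stub("child"))
            _exit(1);
        if (write(to_parent[1], &c, 1) != 1)    /* .. signal parent .. */
            _exit(1);
        if (read(to_child[0], &c, 1) != 1)      /* .. wait parent .. */
            _exit(1);
        _exit(0);
    }

    /* Parent: close the pipe ends that belong to the child. */
    close(to_parent[1]);
    close(to_child[0]);

    ok = read(to_parent[0], &c, 1) == 1         /* .. wait child .. */
        /* exec() would happen here in the real flow (omitted). */
        && change_mm_stub("parent") == 0        /* take the maps back */
        && write(to_child[1], &c, 1) == 1;      /* .. signal child .. */

    /* Closing the write end also unblocks the child on failure. */
    close(to_parent[0]);
    close(to_child[1]);

    if (waitpid(pid, &status, 0) != pid || !ok)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The point of the ordering is that the parent only proceeds once the
child has taken over the maps, and the child only exits once the parent
has taken them back, so at no step is there a moment without an mm
behind them.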

Jason


