[libvirt] [Qemu-devel] [PATCH v7 0/4] Add Mediated device support

Wed Sep 7 18:17:39 UTC 2016

On Wed, Sep 07, 2016 at 10:44:56AM -0600, Alex Williamson wrote:
> On Wed, 7 Sep 2016 21:45:31 +0530
> Kirti Wankhede <kwankhede at nvidia.com> wrote:
> 
> > To hot-plug an mdev device into a domain that already has an mdev
> > device assigned, the new mdev device should be created with the same
> > group number as the existing devices and then hot-plugged. If there is
> > no mdev device in that domain, the group number should be unique.
> > 
> > This simplifies the mdev grouping and also provides flexibility for
> > vendor driver implementation.
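
For illustration only, the group-number rule quoted above could be sketched in Python roughly as follows. The names GroupAllocator and group_for are hypothetical and are not part of the patch series or of any kernel or libvirt interface; the sketch only assumes that some management layer remembers which group number each domain uses.

```python
import itertools


class GroupAllocator:
    """Hypothetical bookkeeping for mdev group numbers, one per domain."""

    def __init__(self):
        self._next = itertools.count(1)   # source of unused group numbers
        self._by_domain = {}              # domain name -> group number

    def group_for(self, domain):
        """Return the group number a new mdev for `domain` should use."""
        if domain not in self._by_domain:
            # No mdev in this domain yet: take a unique number.
            self._by_domain[domain] = next(self._next)
        # Otherwise reuse the number shared by the existing devices.
        return self._by_domain[domain]


alloc = GroupAllocator()
first = alloc.group_for("vm-a")     # first mdev of vm-a: new number
hotplug = alloc.group_for("vm-a")   # hot-plugged mdev: same number
other = alloc.group_for("vm-b")     # different domain: different number
```
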
> 
> The 'start' operation for NVIDIA mdev devices allocates peer-to-peer
> resources between mdev devices.  Does this not represent some degree of
> an isolation hole between those devices?  Will peer-to-peer DMA between
> devices honor the guest IOVA when mdev devices are placed into separate
> address spaces, as is possible with a vIOMMU?

Hi Alex,

In reality, the p2p operation will only work within the same translation domain.

Since we are discussing the use case of multiple mdevs per VM, I think we probably
should not limit the discussion to p2p operation.

So, in general, the requirement of the NVIDIA vGPU device model is to know/register
all mdevs per VM before opening any of those mdev devices.
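
As a rough sketch of that ordering constraint (not the actual vendor driver logic), the Python below models a per-VM set of mdevs that must be complete before the first open. VmMdevSet, register and open are hypothetical names, and rejecting registration after the first open is an assumption made to keep the example simple; how hot-plug would interact with it is not modeled.

```python
class VmMdevSet:
    """Hypothetical per-VM set of mdevs, frozen once any device is opened."""

    def __init__(self):
        self.registered = set()
        self.opened = set()

    def register(self, mdev_uuid):
        # Assumption: the set must be complete before the first open.
        if self.opened:
            raise RuntimeError("cannot register after a device was opened")
        self.registered.add(mdev_uuid)

    def open(self, mdev_uuid):
        if mdev_uuid not in self.registered:
            raise KeyError(mdev_uuid)
        self.opened.add(mdev_uuid)
        # The device model can now see every mdev that belongs to this VM.
        return frozenset(self.registered)


vm = VmMdevSet()
vm.register("mdev-a")
vm.register("mdev-b")
known = vm.open("mdev-a")   # both devices are known at first open
```
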

> 
> I don't particularly like the iommu group solution either, which is why
> in my latest proposal I've given the vendor driver a way to indicate
> this grouping is required so more flexible mdev devices aren't
> restricted by this.  But the limited knowledge I have of the hardware
> configuration which imposes this restriction on NVIDIA devices seems to
> suggest that iommu grouping of these sets is appropriate.  The vfio-core
> infrastructure is almost entirely built for managing vfio groups, which
> are just a direct mapping of iommu groups.  So the complexity of iommu
> groups is already handled.  Adding a new layer of grouping into mdev
> seems like it's increasing the complexity further, not decreasing it.

I really appreciate your thoughts on this issue and your consideration of how the
NVIDIA vGPU device model works, but so far I still feel we are borrowing a very
meaningful concept, the "iommu group", to solve a device model issue, which I actually
hope can be worked around by a more independent piece of logic. That is why Kirti is
proposing the "mdev group".

Let's see if we can address your concerns / questions in Kirti's reply.

Thanks,
Neo

> Thanks,
> 
> Alex