[libvirt] [Qemu-devel] [PATCH v7 0/4] Add Mediated device support

Kirti Wankhede kwankhede at nvidia.com
Sat Sep 3 17:47:41 UTC 2016



On 9/3/2016 6:37 PM, Paolo Bonzini wrote:
> 
> 
> On 03/09/2016 13:56, John Ferlan wrote:
>> On 09/02/2016 05:48 PM, Paolo Bonzini wrote:
>>> On 02/09/2016 20:33, Kirti Wankhede wrote:
>>>> <Alex> We could even do:
>>>>>>
>>>>>> echo $UUID1:$GROUPA > create
>>>>>>
>>>>>> where $GROUPA is the group ID of a previously created mdev device into
>>>>>> which $UUID1 is to be created and added to the same group.
>>>> </Alex>
>>>
>>> >From the point of view of libvirt, I think I prefer Alex's idea.
>>> <group> could be an additional element in the nodedev-create XML:
>>>
>>>     <device>
>>>       <name>my-vgpu</name>
>>>       <parent>pci_0000_86_00_0</parent>
>>>       <capability type='mdev'>
>>>         <type id='11'/>
>>>         <uuid>0695d332-7831-493f-9e71-1c85c8911a08</uuid>
>>>         <group>group1</group>
>>>       </capability>
>>>     </device>
>>>
>>> (should group also be a UUID?)
>>

I replied to earlier mail too, group number doesn't need to be UUID. It
should be a unique number. I think in the discussion in bof someone
mentioned about using domain's unique number that libvirt generates.
That should also work.

>> As long as create_group handles all the work and all libvirt does is
>> call it, get the return status/error, and handle deleting the vGPU on
>> error, then I guess it's doable.
>>

Yes that is the idea. Libvirt doesn't have to care about the groups.
With Alex's proposal, as you mentioned above, libvirt have to provide
group number to mdev_create, check return status and handle error case.

  echo $UUID1:$GROUP1 > mdev_create
  echo $UUID2:$GROUP1 > mdev_create
would create two mdev devices assigned to same domain.


>> Alternatively having multiple <type id='#'> in the XML and performing a
>> single *mdev/create_group is an option.
> 
> I don't really like the idea of a single nodedev-create creating
> multiple devices, but that would work too.
> 
>> That is, what is the "output" from create_group that gets added to the
>> domain XML?  How is that found?
> 
> A new sysfs path is created, whose name depends on the UUID.  The UUID
> is used in a <hostdev> element in the domain XML and the sysfs path
> appears in the QEMU command line.  Kirti and Neo had examples in their
> presentation at KVM Forum.
> 
> If you create multiple devices in the same group, they are added to the
> same IOMMU group so they must be used by the same VM.  However they
> don't have to be available from the beginning; they could be
> hotplugged/hot-unplugged later, since from the point of view of the VM
> those are just another PCI device.
> 
>> Also, once the domain is running can a
>> vGPU be added to the group?  Removed?  What allows/prevents?
> 
> Kirti?... :)

Yes, vGPU could be hot-plugged or hot-unplugged. This also depends on
does vendor driver want to support that. For example, domain is running
with two vGPUs $UUID1 and $UUID2 and user tried to hot-unplug vGPU
$UUID2, vendor driver knows that domain is running and vGPU is being
used in guest, so vendor driver can fail offline/close() call if they
don't support hot-unplug. Similarly for hot-plug vendor driver can fail
create call to not to support hot-plug.

> 
> In principle I don't think anything should block vGPUs from different
> groups being added to the same VM, but I have to defer to Alex and Kirti
> again on this.
> 

No, there should be one group per VM.

>>> Since John brought up the topic of minimal XML, in this case it will be
>>> like this:
>>>
>>>     <device>
>>>       <name>my-vgpu</name>
>>>       <parent>pci_0000_86_00_0</parent>
>>>       <capability type='mdev'>
>>>         <type id='11'/>
>>>       </capability>
>>>     </device>
>>>
>>> The uuid will be autogenerated by libvirt and if there's no <group> (as
>>> is common for VMs with only 1 vGPU) it will be a single-device group.
>>
>> The <name> could be ignored as it seems existing libvirt code wants to
>> generate a name via udevGenerateDeviceName for other devices. I haven't
>> studied it long enough, but I believe that's how those pci_####* names
>> created.
> 
> Yeah that makes sense.  So we get down to a minimal XML that has just
> parent, and capability with type in it; additional elements could be
> name (ignored anyway), and within capability uuid and group.
>

Yes, this seems good.
I would like to have one more capability here. Pulling here some
suggestion from my previous mail:
In the directory structure, a 'params' can take optional parameters.
Libvirt then can set 'params' and then create mdev device. For example,
param say 'disable_console_vnc=1' is set for type 11, then devices
created of type 11 will have that param set unless it is cleared.

 └── mdev_supported_types
     ├── 11
     │   ├── create
     │   ├── description
     │   └── max_instances
     │   └── params
     ├── 12
     │   ├── create
     │   ├── description
     │   └── max_instances
     │   └── params
     └── 13
         ├── create
         ├── description
         └── max_instances
         └── params

So with that XML format would be:
    <device>
      <name>my-vgpu</name>
      <parent>pci_0000_86_00_0</parent>
       <capability type='mdev'>
        <type id='11'/>
	<group>group1</group>
	<params>disable_console_vnc=1</params>
      </capability>
    </device>

and 'params' field should be just a string to libvirt and its optional
also. If user want to provide extra parameter while creating vGPU device
they should provide it in XML file as above to nodedev-create.
Very initial proposal was to have this extra paramter list as a string
to mdev_create itself as:

  echo $UUID1:$PARAMS > mdev_create

I would like to know others opinions on whether it should be part of
mdev_create input or a separate write to 'params' file in sysfs as in
above directory structure.

Kirti.

> Thanks,
> 
> Paolo
> 




More information about the libvir-list mailing list