[libvirt] [RFC PATCH v2 2/4] qemu: parse spapr-vfio-pci controller from xml
Shivaprasad bhat
shivaprasadbhat at gmail.com
Mon Mar 9 09:34:10 UTC 2015
HI Jan,
The spapr-vfio-pci-host-bridge controller is being reworked in
qemu-ppc to handle all the iommu
groups generically(i.e iommu group need not be mentioned in the CLI).
https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-February/124922.html
This would help avoiding the sysfs scanning and allows further simplification
of libvirt patches.
I will post my patches post testing once the qemu changes are merged.
Thanks,
Shiva
On Tue, Dec 9, 2014 at 3:21 PM, Shivaprasad bhat
<shivaprasadbhat at gmail.com> wrote:
> On Mon, Dec 8, 2014 at 9:01 PM, Ján Tomko <jtomko at redhat.com> wrote:
>> This has a qemu: prefix, even though the parsing is done in conf:
>>
>> It would look nicer split into several commits:
>> 1) parsing the new controller type (without auto-adding them) + qemuxml2xml
>> test of the controllers
>> 2) auto-adding the controllers
>> 3) adding the 0 domain parameter to many functions (if needed)
>>
>
> I added the prefix to mention this is for qemu only. With the
> splitting that you are
> suggesting, I think I can be more specific.
>
>
>> On 11/17/2014 11:04 AM, Shivaprasad G Bhat wrote:
>>> This patch will get the iommu group for host devices by XML configuration
>>> of the vfio host bridge controller or from the sysfs.
>>>
>>> On pseries, the iommu group can be shared by multiple host devices and
>>> they would share the same spapr-vfio-host-bus controller. A new
>>> controller is added for every new iommu group. Every spapr-vfio-host-bridge
>>> in the cli creates a new pci domain in the guest. For Example,
>>> -device spapr-pci-vfio-host-bridge,iommu=1,id=SOMEDOMAIN,index=1
>>> The "SOMEDOMAIN" is the id for new pci domain inside guest.
>>>
>>> spapr-pci-vfio-host-bridge is actually a PCI host bridge with VFIO support.
>>> It hosts pci-bridges, and has all the features/behaviours similar to the
>>> default emulated pci host bridge. The controller can host both pci and pcie
>>> devices. For convenience, this root controller uses model "pci-root".
>>>
>>> The sample controller tags would look like below:
>>> <controller type='spapr-vfio-pci' index='0' model='pci-root'
>>> iommuGroupNum='3' domain='1'/>
>>> <controller type='spapr-vfio-pci' index='1' model='pci-bridge'
>>> iommuGroupNum='3' domain='1'>
>>
>> Would it make sense to fill out the iommuGroup on domain startup, depending on
>> the devices that were assigned there?
>>
>> So that libvirt user can do:
>> <controller type='spapr-vfio-pci' domain='1'/>
>> <controller type='spapr-vfio-pci' domain='2'/>
>>
>> Then attach two hostdevs to these domains without knowing their IOMMU group?
>>
>> (Although the user needs to know which devices are in the same group to attach
>> more than one to the same domain)
>
> I have added the iommu group to the controller to have 1:1 mapping of
> domain to iommu. This helps in validating the Addresses to say if a
> device is assigned to the correct domain address. (patch 3 does this).
>
> As you suggested below in one of the comments, If the iommu group detection be
> delayed till the process start, I think we can do away with iommu group
> in the controller xml. I'll see how I can avoid this in v3.
>
>>
>>> <address type='pci' domain='0x0001' bus='0x00' slot='0x02'
>>> function='0x0'/>
>>> </controller>
>>> <controller type='spapr-vfio-pci' index='0' model='pci-root'
>>> iommuGroupNum='13' domain='2'/>
>>>
>>> Like other architectures, unassigned and unmanaged devices in the iommu group
>>> need to be detached manually before the guest is created .
>>>
>>> The spapr-vfio-pci controllers are removed when there are no corresponding
>>> hostdevices in xml.
>>>
>>> Signed-off-by: Shivaprasad G Bhat <sbhat at linux.vnet.ibm.com>
>>> Signed-off-by: Pradipta Kumar Banerjee <bpradip at in.ibm.com>
>>> Reviewed-by: Prerna Saxena <prerna at linux.vnet.ibm.com>
>>> ---
>>> docs/schemas/domaincommon.rng | 28 +++++++
>>> src/bhyve/bhyve_domain.c | 2
>>> src/conf/domain_conf.c | 165 +++++++++++++++++++++++++++++++++++++++--
>>> src/conf/domain_conf.h | 19 +++++
>>> src/libvirt_private.syms | 1
>>> src/qemu/qemu_command.c | 4 +
>>> src/qemu/qemu_domain.c | 12 +--
>>> src/qemu/qemu_driver.c | 6 +
>>> 8 files changed, 222 insertions(+), 15 deletions(-)
>>>
>>
>>
>>> @@ -12113,6 +12165,95 @@ virDomainResourceDefParse(xmlNodePtr node,
>>> return NULL;
>>> }
>>>
>>> +int
>>> +virDomainDefMaybeAddHostdevSpaprPCIVfioControllers(virDomainDefPtr def)
>>> +{
>>> + size_t i, j;
>>> + virDomainHostdevDefPtr hostdev;
>>> + virDomainControllerDefPtr controller;
>>> + int ret = -1;
>>> + int maxDomainId = 0;
>>> + int skip;
>>> +
>>> + if (!(ARCH_IS_PPC64(def->os.arch)) ||
>>> + !(def->os.machine && STRPREFIX(def->os.machine, "pseries")))
>>> + return 0;
>>> +
>>> + for (i = 0; i < def->nhostdevs; i++) {
>>> + hostdev = def->hostdevs[i];
>>> + if (IS_PCI_VFIO_HOSTDEV(hostdev))
>>> + hostdev->source.subsys.u.pci.iommu = -1;
>>> + }
>>> + /* The hostdevs belonging to same iommu are
>>> + * all part of same domain.
>>> + */
>>> + for (i = 0; i < def->ncontrollers; i++) {
>>> + controller = def->controllers[i];
>>> + if (controller->type == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO &&
>>> + controller->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT)
>>> + for (j = 0; j < def->nhostdevs; j++) {
>>> + hostdev = def->hostdevs[j];
>>> + if (IS_PCI_VFIO_HOSTDEV(hostdev))
>>> + if (hostdev->info->addr.pci.domain == controller->domain)
>>> + hostdev->source.subsys.u.pci.iommu = controller->opts.spaprvfio.iommuGroupNum;
>>> + }
>>> + if (controller->domain > maxDomainId)
>>> + maxDomainId = controller->domain;
>>> + }
>>> + /* If the spapr-vfio controller doesnt exist for the hostdev
>>> + * add a controller for that iommu group.
>>> + */
>>> + for (i = 0; i < def->nhostdevs; i++) {
>>> + skip = 0;
>>> + hostdev = def->hostdevs[i];
>>> + if (IS_PCI_VFIO_HOSTDEV(hostdev)) {
>>> + virPCIDeviceAddressPtr addr;
>>> + int iommu = -1;
>>> + if (hostdev->source.subsys.u.pci.iommu == -1) {
>>> + addr = (virPCIDeviceAddressPtr)&hostdev->source.subsys.u.pci.addr;
>>> + if ((iommu = virPCIDeviceAddressGetIOMMUGroupNum(addr)) < 0)
>>> + goto error;
>>
>> I don't know if we should access /sysfs on XML parsing. Could we add the
>> controllers at startup instead?
>
> Are you saying we auto-add the controller around, qemuProcessStart OR
> qemuPrepareHostDevices()
> function calls? The normal domxml-to-native would still need these
> controllers right ?
>
>>
>>> + hostdev->source.subsys.u.pci.iommu = iommu;
>>> +
>>> + for (j = 0; j < def->ncontrollers; j++) {
>>> + controller = def->controllers[j];
>>> + if (controller->type == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO &&
>>> + controller->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) {
>>> + if (iommu == controller->opts.spaprvfio.iommuGroupNum)
>>> + skip = 1;
>>> + }
>>> + }
>>> + if (skip)
>>> + continue;
>>> + if (virDomainDefMaybeAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO,
>>> + ++maxDomainId, 0, VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) < 0)
>>> + goto error;
>>> + def->controllers[def->ncontrollers-1]->opts.spaprvfio.iommuGroupNum = iommu;
>>> + }
>>> + }
>>> + }
>>> +
>>> + /* Remove redundant controllers which dont serve any hostdevs. */
>>
>> We don't delete unused controllers of other types, what if the user explicitly
>> requests them?
>
> If there are stale controllers, the iommu group which they serve would
> get busy, as the
> guest would start using them even though there are no hostdevs from
> that group. Since
> we auto-add them, I felt its better to remove them if unnecessary.
>
>>
>> Jan
>>
>>> + for (i = 0; i < def->ncontrollers; i++) {
>>> + controller = def->controllers[i];
>>> + if (controller->type == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO) {
>>> + int usedController = 0;
>>> + for (j = 0; j < def->nhostdevs; j++) {
>>> + hostdev = def->hostdevs[j];
>>> + if (IS_PCI_VFIO_HOSTDEV(hostdev))
>>> + if (hostdev->source.subsys.u.pci.iommu == controller->opts.spaprvfio.iommuGroupNum)
>>> + usedController = 1;
>>> + }
>>
>>
More information about the libvir-list
mailing list