[libvirt] [PATCH v2 3/3] qemu: Enable NUMA node tag in pci-root for PPC64

Shivaprasad G Bhat sbhat at linux.vnet.ibm.com
Fri Jul 21 07:24:06 UTC 2017


Hi Andrea,


On 07/20/2017 09:16 PM, Andrea Bolognani wrote:
> On Mon, 2017-07-17 at 21:29 +0530, Shivaprasad G Bhat wrote:
>> @@ -3786,6 +3786,11 @@
>>            part of the specified NUMA node (it is up to the user of the
>>            libvirt API to attach host devices to the correct
>>            pci-expander-bus when assigning them to the domain).
>> +        On PPC64, the PCI devices can be specified to be part of a NUMA
>> +        node using only the pci-root controller with an optional
>> +        <code><node></code> subelement within the
>> +        <code><target></code> subelement. The PCI devices on the
>> +        given pci-root controller will be part of the specified NUMA node.
> Instead of adding an entire new sentence here, it would make
> more sense to rephrase what's already present, something like:
>
>    Some PCI controllers (pci-expander-bus for the pc machine
>    type, pcie-expander-bus for the q35 machine type and
>    pci-root for the pseries machine type) can have an optional
>    <node> subelement [...]
>> @@ -9457,6 +9457,12 @@ virDomainControllerDefParseXML(xmlNodePtr node,
>>                    goto error;
>>                }
>>            }
>> +        if (def->idx == 0 && numaNode >= 0) {
>> +            virReportError(VIR_ERR_XML_ERROR, "%s",
>> +                           _("Only the PCI controller with index != 0 can "
>> +                             "have NUMA node property specified"));
>> +                goto error;
>> +        }
>>            if (numaNode >= 0)
>>                def->opts.pciopts.numaNode = numaNode;
> The check you're adding can be merged with the one below it.
>
> The error message should also be reworded, something like:
>
>    The PCI controller with index=0 can't be associated with
>    a NUMA node.
>> @@ -3458,9 +3458,14 @@ qemuDomainControllerDefPostParse(virDomainControllerDefPtr cont,
>>             * that NUMA node is configured in the guest <cpu><numa>
>>             * array. NUMA cell id's in this array are numbered
>>             * from 0 .. size-1.
>> +         *
>> +         * On PSeries, the NUMA node is set at the PHB.
> As above, reworking the existing comment would work better
> than merely appending to it.
>
>>             */
>> -        if ((cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_EXPANDER_BUS ||
>> -             cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCIE_EXPANDER_BUS) &&
>> +        if (((qemuDomainIsPSeries(def) &&
>> +              cont->type == VIR_DOMAIN_CONTROLLER_TYPE_PCI &&
>> +              cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) ||
>> +             (cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_EXPANDER_BUS ||
>> +              cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCIE_EXPANDER_BUS)) &&
>>                (int) virDomainNumaGetNodeCount(def->numa)
>>                <= cont->opts.pciopts.numaNode) {
> I actually don't see any point in the condition being this
> complex: it could just be
>
>    if (cont->opts.pciopts.numaNode >= 0 &&
>        cont->opts.pciopts.numaNode >=
>        (int) virDomainNumaGetNodeCount(def->numa))
>
> because we've already made sure, while parsing, that numaNode
> is only set for controllers that allow it.
>
>> +++ b/tests/qemuxml2argvdata/qemuxml2argv-spapr-pci-host-bridge-numa-node.xml
>> @@ -0,0 +1,54 @@
>> +<domain type='qemu'>
>> +  <name>QEMUGuest1</name>
>> +  <uuid>87eedafe-eedc-4336-8130-ed9fe5dc90c8</uuid>
>> +  <memory unit='KiB'>2097152</memory>
>> +  <currentMemory unit='MiB'>2048</currentMemory>
> You don't need <currentMemory>
>
>> +  <vcpu placement='static'>8</vcpu>
>> +  <numatune>
>> +    <memory mode='strict' nodeset='1'/>
>> +  </numatune>
>> +  <cpu>
>> +    <topology sockets='3' cores='1' threads='8'/>
>> +    <numa>
>> +      <cell id='0' cpus='0-3' memory='1048576' unit='KiB'/>
>> +      <cell id='1' cpus='4-7' memory='1048576' unit='KiB'/>
>> +    </numa>
>> +  </cpu>
>> +  <os>
>> +    <type arch='ppc64' machine='pseries'>hvm</type>
>> +    <boot dev='hd'/>
> <boot> is unnecessary.
>
>> +  </os>
>> +  <clock offset='utc'/>
>> +  <on_poweroff>destroy</on_poweroff>
>> +  <on_reboot>restart</on_reboot>
>> +  <on_crash>destroy</on_crash>
> <clock> and <on_*> can be removed.
>
>> +  <devices>
>> +    <emulator>/usr/bin/qemu-system-ppc64</emulator>
>> +    <disk type='block' device='disk'>
>> +      <driver name='qemu' type='raw'/>
>> +      <source dev='/dev/HostVG/QEMUGuest1'/>
>> +      <target dev='hda' bus='scsi'/>
>> +      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
>> +    </disk>
> <disk> is irrelevant for the test case at hand.
>
>> +    <controller type='usb' index='0'/>
> The model should be 'none'.
>
>> +    <controller type='scsi' index='0'/>
> The SCSI controller is also not useful here.
>
>> +    <controller type='pci' index='0' model='pci-root'>
>> +      <target index='0'/>
>> +    </controller>
>> +    <controller type='pci' index='1' model='pci-root'>
>> +      <target index='1'>
>> +        <node>1</node>
>> +      </target>
>> +    </controller>
>> +    <controller type='pci' index='2' model='pci-root'>
>> +      <target index='2'/>
>> +    </controller>
>> +    <controller type='pci' index='3' model='pci-root'>
>> +      <target index='3'>
>> +        <node>0</node>
>> +      </target>
>> +    </controller>
>> +    <memballoon model='none'/>
>> +    <panic model='pseries'/>
> You can omit <panic>.
>
>> +++ b/tests/qemuxml2argvtest.c
>> @@ -2739,6 +2739,9 @@ mymain(void)
>>        DO_TEST_PARSE_ERROR("cpu-cache-emulate-l2", QEMU_CAPS_KVM);
>>        DO_TEST_PARSE_ERROR("cpu-cache-passthrough3", QEMU_CAPS_KVM);
>>        DO_TEST_PARSE_ERROR("cpu-cache-passthrough-l3", QEMU_CAPS_KVM);
>> +    DO_TEST("spapr-pci-host-bridge-numa-node", QEMU_CAPS_NUMA,
>> +            QEMU_CAPS_DEVICE_SPAPR_PCI_HOST_BRIDGE,
>> +            QEMU_CAPS_SPAPR_PCI_HOST_BRIDGE_NUMA_NODE);
> This should be moved up to be close to the pserie-phb* test
> cases, and renamed to something like 'pseries-phb-numa-node'.
>
> There should be one capability per line.
>
> You also need to have an identical test case in qemuxml2xml,
> and at least one negative test to show that trying to
> assign the default PHB to a NUMA node will result in failure.
Agree to all your comments. Fixing them all.

As I relook at my test case, I realize the numatune should be for the 
memnode
instead of memory for this use case. Fixing that as well. For 
pci-expander-bus
test cases I see no numatune in the xml, will fix them later.

>
> -- 
> Andrea Bolognani / Red Hat / Virtualization
>




More information about the libvir-list mailing list