[libvirt] [PATCH v2 3/3] qemu: Enable NUMA node tag in pci-root for PPC64
Shivaprasad G Bhat
sbhat at linux.vnet.ibm.com
Fri Jul 21 07:24:06 UTC 2017
Hi Andrea,
On 07/20/2017 09:16 PM, Andrea Bolognani wrote:
> On Mon, 2017-07-17 at 21:29 +0530, Shivaprasad G Bhat wrote:
>> @@ -3786,6 +3786,11 @@
>> part of the specified NUMA node (it is up to the user of the
>> libvirt API to attach host devices to the correct
>> pci-expander-bus when assigning them to the domain).
>> + On PPC64, the PCI devices can be specified to be part of a NUMA
>> + node using only the pci-root controller with an optional
>> + <code><node></code> subelement within the
>> + <code><target></code> subelement. The PCI devices on the
>> + given pci-root controller will be part of the specified NUMA node.
> Instead of adding an entire new sentence here, it would make
> more sense to rephrase what's already present, something like:
>
> Some PCI controllers (pci-expander-bus for the pc machine
> type, pcie-expander-bus for the q35 machine type and
> pci-root for the pseries machine type) can have an optional
> <node> subelement [...]
>> @@ -9457,6 +9457,12 @@ virDomainControllerDefParseXML(xmlNodePtr node,
>> goto error;
>> }
>> }
>> + if (def->idx == 0 && numaNode >= 0) {
>> + virReportError(VIR_ERR_XML_ERROR, "%s",
>> + _("Only the PCI controller with index != 0 can "
>> + "have NUMA node property specified"));
>> + goto error;
>> + }
>> if (numaNode >= 0)
>> def->opts.pciopts.numaNode = numaNode;
> The check you're adding can be merged with the one below it.
>
> The error message should also be reworded, something like:
>
> The PCI controller with index=0 can't be associated with
> a NUMA node.
>> @@ -3458,9 +3458,14 @@ qemuDomainControllerDefPostParse(virDomainControllerDefPtr cont,
>> * that NUMA node is configured in the guest <cpu><numa>
>> * array. NUMA cell id's in this array are numbered
>> * from 0 .. size-1.
>> + *
>> + * On PSeries, the NUMA node is set at the PHB.
> As above, reworking the existing comment would work better
> than merely appending to it.
>
>> */
>> - if ((cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_EXPANDER_BUS ||
>> - cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCIE_EXPANDER_BUS) &&
>> + if (((qemuDomainIsPSeries(def) &&
>> + cont->type == VIR_DOMAIN_CONTROLLER_TYPE_PCI &&
>> + cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) ||
>> + (cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_EXPANDER_BUS ||
>> + cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCIE_EXPANDER_BUS)) &&
>> (int) virDomainNumaGetNodeCount(def->numa)
>> <= cont->opts.pciopts.numaNode) {
> I actually don't see any point in the condition being this
> complex: it could just be
>
> if (cont->opts.pciopts.numaNode >= 0 &&
> cont->opts.pciopts.numaNode >=
> (int) virDomainNumaGetNodeCount(def->numa))
>
> because we've already made sure, while parsing, that numaNode
> is only set for controllers that allow it.
>
>> +++ b/tests/qemuxml2argvdata/qemuxml2argv-spapr-pci-host-bridge-numa-node.xml
>> @@ -0,0 +1,54 @@
>> +<domain type='qemu'>
>> + <name>QEMUGuest1</name>
>> + <uuid>87eedafe-eedc-4336-8130-ed9fe5dc90c8</uuid>
>> + <memory unit='KiB'>2097152</memory>
>> + <currentMemory unit='MiB'>2048</currentMemory>
> You don't need <currentMemory>
>
>> + <vcpu placement='static'>8</vcpu>
>> + <numatune>
>> + <memory mode='strict' nodeset='1'/>
>> + </numatune>
>> + <cpu>
>> + <topology sockets='3' cores='1' threads='8'/>
>> + <numa>
>> + <cell id='0' cpus='0-3' memory='1048576' unit='KiB'/>
>> + <cell id='1' cpus='4-7' memory='1048576' unit='KiB'/>
>> + </numa>
>> + </cpu>
>> + <os>
>> + <type arch='ppc64' machine='pseries'>hvm</type>
>> + <boot dev='hd'/>
> <boot> is unnecessary.
>
>> + </os>
>> + <clock offset='utc'/>
>> + <on_poweroff>destroy</on_poweroff>
>> + <on_reboot>restart</on_reboot>
>> + <on_crash>destroy</on_crash>
> <clock> and <on_*> can be removed.
>
>> + <devices>
>> + <emulator>/usr/bin/qemu-system-ppc64</emulator>
>> + <disk type='block' device='disk'>
>> + <driver name='qemu' type='raw'/>
>> + <source dev='/dev/HostVG/QEMUGuest1'/>
>> + <target dev='hda' bus='scsi'/>
>> + <address type='drive' controller='0' bus='0' target='0' unit='0'/>
>> + </disk>
> <disk> is irrelevant for the test case at hand.
>
>> + <controller type='usb' index='0'/>
> The model should be 'none'.
>
>> + <controller type='scsi' index='0'/>
> The SCSI controller is also not useful here.
>
>> + <controller type='pci' index='0' model='pci-root'>
>> + <target index='0'/>
>> + </controller>
>> + <controller type='pci' index='1' model='pci-root'>
>> + <target index='1'>
>> + <node>1</node>
>> + </target>
>> + </controller>
>> + <controller type='pci' index='2' model='pci-root'>
>> + <target index='2'/>
>> + </controller>
>> + <controller type='pci' index='3' model='pci-root'>
>> + <target index='3'>
>> + <node>0</node>
>> + </target>
>> + </controller>
>> + <memballoon model='none'/>
>> + <panic model='pseries'/>
> You can omit <panic>.
>
>> +++ b/tests/qemuxml2argvtest.c
>> @@ -2739,6 +2739,9 @@ mymain(void)
>> DO_TEST_PARSE_ERROR("cpu-cache-emulate-l2", QEMU_CAPS_KVM);
>> DO_TEST_PARSE_ERROR("cpu-cache-passthrough3", QEMU_CAPS_KVM);
>> DO_TEST_PARSE_ERROR("cpu-cache-passthrough-l3", QEMU_CAPS_KVM);
>> + DO_TEST("spapr-pci-host-bridge-numa-node", QEMU_CAPS_NUMA,
>> + QEMU_CAPS_DEVICE_SPAPR_PCI_HOST_BRIDGE,
>> + QEMU_CAPS_SPAPR_PCI_HOST_BRIDGE_NUMA_NODE);
> This should be moved up to be close to the pserie-phb* test
> cases, and renamed to something like 'pseries-phb-numa-node'.
>
> There should be one capability per line.
>
> You also need to have an identical test case in qemuxml2xml,
> and at least one negative test to show that trying to
> assign the default PHB to a NUMA node will result in failure.
Agree to all your comments. Fixing them all.
As I relook at my test case, I realize the numatune should be for the
memnode
instead of memory for this use case. Fixing that as well. For
pci-expander-bus
test cases I see no numatune in the xml, will fix them later.
>
> --
> Andrea Bolognani / Red Hat / Virtualization
>
More information about the libvir-list
mailing list