[PATCH 4/5] qemu: Prefer -numa cpu over -numa node,cpus=

Michal Privoznik mprivozn at redhat.com
Fri May 22 16:28:31 UTC 2020


On 5/22/20 6:07 PM, Igor Mammedov wrote:
> On Fri, 22 May 2020 16:14:14 +0200
> Michal Privoznik <mprivozn at redhat.com> wrote:
> 
>> QEMU is trying to obsolete -numa node,cpus= because that uses
>> an ambiguous vCPU ID to [socket, die, core, thread] mapping. The new
>> form is:
>>
>>    -numa cpu,node-id=N,socket-id=S,die-id=D,core-id=C,thread-id=T
>>
>> which is repeated for every vCPU and places it at [S, D, C, T]
>> into guest NUMA node N.
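
For example, with -smp 4,sockets=2,cores=2,threads=1 and two guest
NUMA nodes, the two spellings would look roughly like this (a sketch;
exact syntax varies with QEMU version and machine type, e.g. die-id
only appears on machines that expose dies):

  Old:
    -numa node,nodeid=0,cpus=0-1 \
    -numa node,nodeid=1,cpus=2-3

  New:
    -numa node,nodeid=0 -numa node,nodeid=1 \
    -numa cpu,node-id=0,socket-id=0,core-id=0,thread-id=0 \
    -numa cpu,node-id=0,socket-id=0,core-id=1,thread-id=0 \
    -numa cpu,node-id=1,socket-id=1,core-id=0,thread-id=0 \
    -numa cpu,node-id=1,socket-id=1,core-id=1,thread-id=0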
>>
>> While in general this is a magic mapping, we can deal with it.
>> Firstly, with QEMU 2.7 or newer, libvirt ensures that if a topology
>> is given then maxvcpus must equal sockets * dies * cores * threads
>> (i.e. there are no 'holes').
>> Secondly, if no topology is given then libvirt itself places each
>> vCPU into a different socket (basically, it fakes a topology of
>> [maxvcpus, 1, 1, 1]).
>> Thirdly, we can copy whatever QEMU is doing when mapping vCPUs
>> onto the topology, to make sure vCPUs don't start to move around.
>>
>> Note, migration from old to new cmd line works and therefore
>> doesn't need any special handling.
>>
>> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1678085
>>
>> Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
>> ---
>>   src/qemu/qemu_command.c                       | 108 +++++++++++++++++-
>>   .../hugepages-nvdimm.x86_64-latest.args       |   4 +-
>>   ...memory-default-hugepage.x86_64-latest.args |  10 +-
>>   .../memfd-memory-numa.x86_64-latest.args      |  10 +-
>>   ...y-hotplug-nvdimm-access.x86_64-latest.args |   4 +-
>>   ...ry-hotplug-nvdimm-align.x86_64-latest.args |   4 +-
>>   ...ry-hotplug-nvdimm-label.x86_64-latest.args |   4 +-
>>   ...ory-hotplug-nvdimm-pmem.x86_64-latest.args |   4 +-
>>   ...ory-hotplug-nvdimm-ppc64.ppc64-latest.args |   4 +-
>>   ...hotplug-nvdimm-readonly.x86_64-latest.args |   4 +-
>>   .../memory-hotplug-nvdimm.x86_64-latest.args  |   4 +-
>>   ...vhost-user-fs-fd-memory.x86_64-latest.args |   4 +-
>>   ...vhost-user-fs-hugepages.x86_64-latest.args |   4 +-
>>   ...host-user-gpu-secondary.x86_64-latest.args |   3 +-
>>   .../vhost-user-vga.x86_64-latest.args         |   3 +-
>>   15 files changed, 158 insertions(+), 16 deletions(-)
>>
>> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
>> index 7d84fd8b5e..0de4fe4905 100644
>> --- a/src/qemu/qemu_command.c
>> +++ b/src/qemu/qemu_command.c
>> @@ -7079,6 +7079,91 @@ qemuBuildNumaOldCPUs(virBufferPtr buf,
>>   }
>>   
>>   
>> +/**
>> + * qemuTranlsatevCPUID:
>> + *
>> + * For a given vCPU @id and vCPU topology (@cpu) compute the corresponding
>> + * @socket, @die, @core and @thread. This assumes a linear topology,
>> + * that is, every [socket, die, core, thread] combination is a valid vCPU
>> + * ID and there are no 'holes'. This is ensured by
>> + * qemuValidateDomainDef() if QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS is
>> + * set.
> I wouldn't make this assumption: each machine can have (and has) its own layout,
> and now it's not hard to change that per machine version if necessary.
> 
> I'd suppose one could pull the list of possible CPUs from QEMU started
> in preconfig mode with the desired -smp x,y,z using QUERY_HOTPLUGGABLE_CPUS,
> and then continue to configure NUMA with QMP commands using the provided
> CPU layout.

Continue where? In 'preconfig mode' the guest is already started, 
isn't it? Are you suggesting that libvirt starts a dummy QEMU process, 
fetches the CPU topology from it and then starts it for real? Libvirt 
tries to avoid that as much as it can.
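
As far as I understand it, the suggested flow would be roughly the
following (a sketch from memory of the QMP schema, so the exact field
names may be off):

  $ qemu-system-x86_64 -preconfig -qmp stdio \
        -smp 4,sockets=2,cores=2,threads=1 ...

  -> { "execute": "qmp_capabilities" }
  -> { "execute": "query-hotpluggable-cpus" }
  <- { "return": [ { "props": { "socket-id": 0, "core-id": 0,
                                "thread-id": 0 }, ... }, ... ] }
  -> { "execute": "set-numa-node",
       "arguments": { "type": "node", "nodeid": 0 } }
  -> { "execute": "set-numa-node",
       "arguments": { "type": "cpu", "node-id": 0,
                      "socket-id": 0, "core-id": 0, "thread-id": 0 } }
  -> { "execute": "x-exit-preconfig" }

That is, libvirt would have to drive this extra QMP phase on every
fresh start.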

> 
> How to present it to the libvirt user I'm not sure (perhaps give them
> that list and let them select from it?)

This is what I am trying to figure out in the cover letter. Maybe we 
need to let users configure the topology (well, the vCPU ID to [socket, 
die, core, thread] mapping), but then again, in my testing the guest 
ignored that and displayed a different topology (true, I was testing 
with -cpu host, so maybe that's why).
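
For completeness, the fixed mapping the patch derives (under the 'no
holes' assumption above) boils down to a simple decomposition. An
illustrative C sketch, not the literal patch code (the name and
signature here are made up):

  static void
  qemuTranslateVcpuID(unsigned int id,
                      unsigned int dies,
                      unsigned int cores,
                      unsigned int threads,
                      unsigned int *socket,
                      unsigned int *die,
                      unsigned int *core,
                      unsigned int *thread)
  {
      /* Peel off the fastest-varying component first (threads),
       * then cores, then dies; the remainder is the socket. */
      *thread = id % threads;
      *core = (id / threads) % cores;
      *die = (id / (threads * cores)) % dies;
      *socket = id / (threads * cores * dies);
  }

E.g. with sockets=2, dies=1, cores=2, threads=2, vCPU 5 maps to
[socket 1, die 0, core 0, thread 1].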

> But it's irrelevant to the patch: magical IDs for socket/core/...whatever
> should not be generated by libvirt anymore, but rather taken from QEMU for
> a given machine + -smp combination.

Taken when? We can do this for running machines, but not for freshly 
started ones, can we?

Michal



