[PATCH 4/5] qemu: Prefer -numa cpu over -numa node,cpus=
Michal Privoznik
mprivozn at redhat.com
Fri May 22 16:28:31 UTC 2020
On 5/22/20 6:07 PM, Igor Mammedov wrote:
> On Fri, 22 May 2020 16:14:14 +0200
> Michal Privoznik <mprivozn at redhat.com> wrote:
>
>> QEMU is trying to obsolete -numa node,cpus= because that uses
>> ambiguous vCPU id to [socket, die, core, thread] mapping. The new
>> form is:
>>
>> -numa cpu,node-id=N,socket-id=S,die-id=D,core-id=C,thread-id=T
>>
>> which is repeated for every vCPU and places it at [S, D, C, T]
>> into guest NUMA node N.
>>
>> While in general this is a magic mapping, we can deal with it.
>> Firstly, with QEMU 2.7 or newer, libvirt ensures that if topology
>> is given then maxvcpus must be sockets * dies * cores * threads
>> (i.e. there are no 'holes').
>> Secondly, if no topology is given then libvirt itself places each
>> vCPU into a different socket (basically, it fakes topology of:
>> [maxvcpus, 1, 1, 1])
>> Thirdly, we can copy whatever QEMU is doing when mapping vCPUs
>> onto topology, to make sure vCPUs don't start to move around.
>>
>> Note, migration from old to new cmd line works and therefore
>> doesn't need any special handling.
>>
>> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1678085
>>
>> Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
>> ---
>> src/qemu/qemu_command.c | 108 +++++++++++++++++-
>> .../hugepages-nvdimm.x86_64-latest.args | 4 +-
>> ...memory-default-hugepage.x86_64-latest.args | 10 +-
>> .../memfd-memory-numa.x86_64-latest.args | 10 +-
>> ...y-hotplug-nvdimm-access.x86_64-latest.args | 4 +-
>> ...ry-hotplug-nvdimm-align.x86_64-latest.args | 4 +-
>> ...ry-hotplug-nvdimm-label.x86_64-latest.args | 4 +-
>> ...ory-hotplug-nvdimm-pmem.x86_64-latest.args | 4 +-
>> ...ory-hotplug-nvdimm-ppc64.ppc64-latest.args | 4 +-
>> ...hotplug-nvdimm-readonly.x86_64-latest.args | 4 +-
>> .../memory-hotplug-nvdimm.x86_64-latest.args | 4 +-
>> ...vhost-user-fs-fd-memory.x86_64-latest.args | 4 +-
>> ...vhost-user-fs-hugepages.x86_64-latest.args | 4 +-
>> ...host-user-gpu-secondary.x86_64-latest.args | 3 +-
>> .../vhost-user-vga.x86_64-latest.args | 3 +-
>> 15 files changed, 158 insertions(+), 16 deletions(-)
>>
>> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
>> index 7d84fd8b5e..0de4fe4905 100644
>> --- a/src/qemu/qemu_command.c
>> +++ b/src/qemu/qemu_command.c
>> @@ -7079,6 +7079,91 @@ qemuBuildNumaOldCPUs(virBufferPtr buf,
>> }
>>
>>
>> +/**
>> +/**
>> + * qemuTranslatevCPUID:
>> + *
>> + * For given vCPU @id and vCPU topology (@cpu) compute the
>> + * corresponding @socket, @die, @core and @thread. This assumes
>> + * linear topology, that is every [socket, die, core, thread]
>> + * combination is a valid vCPU ID and there are no 'holes'. This
>> + * is ensured by
>> + * qemuValidateDomainDef() if QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS is
>> + * set.
> I wouldn't make this assumption, each machine can have (and has) its own layout,
> and now it's not hard to change that per machine version if necessary.
>
> I'd suppose one could pull the list of possible CPUs from QEMU started
> in preconfig mode with desired -smp x,y,z using QUERY_HOTPLUGGABLE_CPUS
> and then continue to configure numa with QMP commands using provided
> CPUs layout.
Continue where? In 'preconfig mode' the guest is already started,
isn't it? Are you suggesting that libvirt starts a dummy QEMU process,
fetches the CPU topology from it and then starts it for real? Libvirt
tries to avoid that as much as it can.
>
> How to present it to libvirt user I'm not sure (give them that list perhaps
> and let select from it???)
This is what I am trying to figure out in the cover letter. Maybe we
need to let users configure the topology (well, vCPU id to [socket, die,
core, thread] mapping), but then again, in my testing the guest ignored
that and displayed a different topology (true, I was testing with -cpu
host, so maybe that's why).
> But it's irrelevant to the patch: magical IDs for socket/core/...whatever
> should not be generated by libvirt anymore, but rather taken from QEMU for a
> given machine + -smp combination.
Taken when? We can do this for running machines, but not for freshly
started ones, can we?
Michal
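For reference, the preconfig flow Igor describes would look roughly as
follows. The commands (-preconfig, query-hotpluggable-cpus,
set-numa-node, x-exit-preconfig) exist in QEMU 3.0+, but the socket
path, -smp values and NUMA assignment below are purely illustrative:

```
# Start QEMU paused before machine creation (paths illustrative):
qemu-system-x86_64 -preconfig \
    -smp 8,sockets=2,dies=1,cores=2,threads=2 \
    -qmp unix:/tmp/qmp.sock,server,nowait ...

# Then, over QMP:
{"execute": "query-hotpluggable-cpus"}
# ... returns the machine's own list of valid
# [socket, die, core, thread] slots.
{"execute": "set-numa-node",
 "arguments": {"type": "cpu", "node-id": 0,
               "socket-id": 0, "die-id": 0,
               "core-id": 0, "thread-id": 0}}
# ... repeated per vCPU, then:
{"execute": "x-exit-preconfig"}
```

The sticking point discussed above is that this requires the QEMU
process to already be running when the NUMA placement is decided.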