[libvirt] [PATCH v2 2/2] qemu: Set up EMULATOR thread and cpuset.mems before exec()-ing qemu

Michal Privoznik mprivozn at redhat.com
Mon Apr 15 16:32:32 UTC 2019


On 4/15/19 3:47 PM, Martin Kletzander wrote:
> On Wed, Apr 10, 2019 at 06:10:44PM +0200, Michal Privoznik wrote:
>> It's funny how this went unnoticed for such a long time. Long
>> story short, if a domain is configured with
>> VIR_DOMAIN_NUMATUNE_MEM_STRICT libvirt doesn't really honour
>> that. This is because of 7e72ac787848 after which libvirt allowed
>> qemu to allocate memory just anywhere and only after that it used
>> some magic involving cpuset.memory_migrate and cpuset.mems to
>> move the memory to desired NUMA nodes. This was done in order to
>> work around some KVM bug where KVM would fail if there wasn't a
>> DMA zone available on the NUMA node. Well, while the work around
>> might stopped libvirt tickling the KVM bug it also caused a bug
>> on libvirt side: if there is not enough memory on configured NUMA
>> node(s) then any attempt to start a domain must fail. Because of
>> the way we play with guest memory domains can start just happily.
>>
>> The solution is to move the child we've just forked into emulator
>> cgroup, set up cpuset.mems and exec() qemu only after that.
>>
> 
> So you are saying this was a bug in KVM?  Is it fixed now?  I am not 
> against
> this patch, I hated that I had to do the workaround, but I just want to 
> be sure
> we won't start hitting that again.

Yes, that's what I'm saying. Looks like the KVM bug is fixed now because 
with a Fedora 29 on a NUMA machine I can start domains just fine.

> 
>> Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
>> ---
>> src/qemu/qemu_process.c | 12 ++++++++----
>> 1 file changed, 8 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
>> index 47d8ca2ff1..076ec18e21 100644
>> --- a/src/qemu/qemu_process.c
>> +++ b/src/qemu/qemu_process.c
>> @@ -6653,6 +6653,14 @@ qemuProcessLaunch(virConnectPtr conn,
>>     if (qemuProcessInitCpuAffinity(vm) < 0)
>>         goto cleanup;
>>
>> +    VIR_DEBUG("Setting emulator tuning/settings");
>> +    if (qemuProcessSetupEmulator(vm) < 0)
>> +        goto cleanup;
>> +
>> +    VIR_DEBUG("Setting up post-init cgroup restrictions");
> 
> This is not post-init any more, but more importantly,
> 
>> +    if (qemuSetupCpusetMems(vm) < 0)
> 
> This function does a subset of what qemuProcessSetupEmulator() called right
> before, does, so I see no reason for it being called here, or to keep 
> existing
> in the codebase for that matter.

Ah, good point. I'll send v3.

Michal




More information about the libvir-list mailing list