[libvirt] [PATCH] qemu: don't setup cpuset.mems if memory mode in numatune is 'preferred'
Martin Kletzander
mkletzan at redhat.com
Fri Nov 7 11:18:45 UTC 2014
On Fri, Nov 07, 2014 at 05:36:43PM +0800, Wang Rui wrote:
>On 2014/11/5 16:07, Martin Kletzander wrote:
>[...]
>>>>> diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
>>>>> index b5bdb36..8685d6f 100644
>>>>> --- a/src/qemu/qemu_cgroup.c
>>>>> +++ b/src/qemu/qemu_cgroup.c
>>>>> @@ -618,6 +618,11 @@ qemuSetupCpusetMems(virDomainObjPtr vm,
>>>>> if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET))
>>>>> return 0;
>>>>>
>>>>> + if (virDomainNumatuneGetMode(vm->def->numatune, -1) !=
>>>>> + VIR_DOMAIN_NUMATUNE_MEM_STRICT) {
>>>>> + return 0;
>>>>> + }
>>>>> +
>>>>
>>>> One question: is it a problem only for 'preferred', or for 'interleave'
>>>> as well? Because if it's only a problem for 'preferred', then the check
>>>> is wrong. If it's a problem for 'interleave' as well, then the commit
>>>> message is wrong.
>>>>
>>> 'interleave' with a single node(such as nodeset='0') will cause the same error.
>>> But 'interleave' mode should not live with a single node. So maybe there's
>>> another bugfix to check 'interleave' with single node.
>>>
>>
>> Well, I'd be OK with just changing the commit message to mention that.
>> This fix is still a valid one and will fix both issues, won't it?
>>
>>> If configured with 'interleave' and multiple nodes(such as nodeset='0-1'),
>>> VM can be started successfully. And cpuset.mems is set to the same nodeset.
>>> So I'll revise my patch.
>>>
>>> I'll send patches V2. Conclusion:
>>>
>>> 1/3 : add check for 'interleave' mode with single numa node
>>> 2/3 : fix this problem in qemu
>>> 3/3 : fix this problem in lxc
>>>
>>> Is it OK?
>>>
>>>> Anyway, after either one is fixed, I can push this.
>>>>
>
>I tested this problem again and found that this error occurs with every
>memory mode. It was broken by commit 411cea638f6ec8503b7142a31e58b1cd85dbeaba,
>which I wrote:
> qemu: move setting emulatorpin ahead of monitor showing up
>
>I'm sorry for that.
>
>That patch moved qemuSetupCgroupForEmulator before qemuSetupCgroupPostInit.
>
>I have ideas to fix that.
>
>1. Move qemuSetupCgroupPostInit ahead of the monitor showing up, too.
> Of course it's before qemuSetupCgroupForEmulator.
> This would fix the bug that I introduced.
> (RFC)
>
That cannot be done, IIRC, because we need the monitor to get the
vCPU <-> thread mapping from it.
>2. Once the first problem is fixed, there is still the second problem, which
> is the one I originally wanted to fix. If memory mode is 'preferred' with
> one node (such as nodeset='0'), the domain's memory is not guaranteed to
> be in node 0. Assuming that node 0 doesn't have enough memory, memory
> can be allocated on node 1. Then if we set cpuset.mems to '0', it may
> cause OOM.
> The solution is checking the memory mode in (lxc)qemuSetupCpusetMems, as in
> my patch on Tuesday. Such as
>
> + if (virDomainNumatuneGetMode(vm->def->numatune, -1) !=
> + VIR_DOMAIN_NUMATUNE_MEM_PREFERRED) {
>
Either this (as it makes sense to restrict qemu even for 'interleave')
or the previous check is fine too (just because that was what we did
before; I just rewrote it with a few problems).
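To spell out how the two checks differ per mode, here is a minimal
standalone sketch. It assumes my reading above of the second check (only
'preferred' is left unrestricted). The enum and function names are made up
for illustration; they are not libvirt's API and this is not the actual
patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the numatune memory modes (not libvirt's enum). */
enum mem_mode { MEM_STRICT, MEM_PREFERRED, MEM_INTERLEAVE };

/* Variant A (first proposed fix): cpuset.mems is set only for 'strict'. */
bool sets_mems_strict_only(enum mem_mode mode)
{
    return mode == MEM_STRICT;
}

/* Variant B (later proposal, as read above): cpuset.mems is set for
 * every mode except 'preferred'. */
bool sets_mems_unless_preferred(enum mem_mode mode)
{
    return mode != MEM_PREFERRED;
}
```

The only mode where the two variants disagree is 'interleave': variant A
leaves cpuset.mems alone, variant B restricts it to the nodeset.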
>BTW:
>3. After the first problem has been fixed, we can start domains with xml:
> <numatune>
> <memory mode='interleave' nodeset='0'/>
> </numatune>
>
> Is a single node '0' valid for 'interleave' ? I take 'interleave' as
> 'at least two nodes'.
>
Well, interleave of 1 node is effectively 'strict', isn't it? What
errors do you get if you try that? (my kernel stopped accepting
numa=fake=2 as a cmdline parameter :( )
Anyway, I think the best way would be mimicking the old behaviour by
just adding your first proposed fix "if (mode != STRICT) return 0",
just with the fixed-up commit message.
Martin