[libvirt] [RFC PATCH 01/17] qemu: setup shared memory without explicit numa configuration

John Ferlan jferlan at redhat.com
Thu Aug 16 11:32:57 UTC 2018



On 08/16/2018 06:31 AM, Marc-André Lureau wrote:
> Hi
> 
> On Thu, Aug 16, 2018 at 4:35 AM, John Ferlan <jferlan at redhat.com> wrote:
>>
>>
>> On 07/13/2018 09:28 AM, marcandre.lureau at redhat.com wrote:
>>> From: Marc-André Lureau <marcandre.lureau at redhat.com>
>>>
>>> When a domain is configured with 'shared' memory backing:
>>>
>>>   <memoryBacking>
>>>     <access mode='shared'/>
>>>   </memoryBacking>
>>>
>>> But no explicit NUMA configuration, let's configure a shared memory
>>> backend associated with default -numa.
>>>
>>> Signed-off-by: Marc-André Lureau <marcandre.lureau at redhat.com>
>>> ---
>>>  src/qemu/qemu_command.c                       | 100 ++++++++++++------
>>>  .../fd-memory-no-numa-topology.args           |   4 +
>>>  2 files changed, 73 insertions(+), 31 deletions(-)
>>>
>>
>> NUMA, memory backends, and hugepages - not in my wheelhouse of
>> knowledge. Hopefully Michal and/or Pavel will take a look!
>>
>> Is it possible someone may not want this type of thing to happen? Is
> 
> I assume someone that sets 'shared' memory mode may consider this as a bug fix.
> 
>> there an upside or downside to this?  What happens "today" when not
> 
> You get non-shared memory
> 

So today someone asks for "shared" and then end up with "non-shared"? I
don't think that's apparent from the "access" description in:

https://libvirt.org/formatdomain.html#elementsMemoryBacking

Then again, not in my wheelhouse of knowledge, so maybe that's just one
of those givens. Of course that perhaps goes to your first answer of
this being a "bug fix". Not something that's apparent from the existing
documentation or commit description though. This probably should have
been it's own separate patch and not included in this series.

>> generated? And of course, what about migration concerns about
>> unconditionally doing this for some target migration?
> 
> True, this will break migration though if the target uses
> numa/memory-backend-file.
> 
> What do you suggest?
> 

This needs to be configurable as we cannot break w/ migration. I'm not
quite sure how we document this other than as Dan suggests that if
someone wants NUMA, then they'll configure things properly. If this
'shared' ends up as 'non-shared' because NUMA isn't configured, then we
should indicate that. If the adding some property to the element to
indicate desire of 'shared' via usage of some (local) backing store
which means the guest probably isn't very migrate-able, then so bit it.

>>
>>> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
>>> index 44ae8dcef7..f1235099b2 100644
>>> --- a/src/qemu/qemu_command.c
>>> +++ b/src/qemu/qemu_command.c
>>> @@ -3254,26 +3254,21 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps,
>>>
>>>
>>>  static int
>>> -qemuBuildMemoryCellBackendStr(virDomainDefPtr def,
>>> -                              virQEMUDriverConfigPtr cfg,
>>> -                              size_t cell,
>>> -                              qemuDomainObjPrivatePtr priv,
>>> -                              virBufferPtr buf)
>>> +qemuBuildMemoryBackendStr(virDomainDefPtr def,
>>> +                          virQEMUDriverConfigPtr cfg,
>>> +                          const char *alias,
>>
>> So the one "concern" I'd have here is some time in the future the @mem
>> gets allocated and handled like a real device eventually calling
>> virDomainDeviceInfoClear and that'd be a problem for the passed const
>> char * string.  Some future person's problem I guess!
>>
>>> +                          int targetNode,
>>> +                          unsigned long long memsize,
>>> +                          qemuDomainObjPrivatePtr priv,
>>> +                          virBufferPtr buf)
>>
>> As much as a long name is a pain, is this more of a :
>>
>>     qemuBuildMemorySharedDefaultBackendStr
> 
> Why?
> 

nm, I think when first reading for some reason I had this "separate"
from the CellBackend call.

[...]

>>> +            implicit = true;
>>> +        } else {
>>> +            ret = 0;
>>> +            goto cleanup;
>>> +        }
>>> +    }
>>
>> So if ncells == 0 && def->mem.access != VIR_DOMAIN_MEMORY_ACCESS_SHARED,
>> then we return 0 without doing the subsequent code? Is that expected? Is
>> there something done later that may be necessary, needed, or assumed.
> 
> No, before the patch, virDomainNumaGetNodeCount() is checked before
> calling qemuBuildNumaArgStr(). Now it is handled inside in case
> ncells==0 && def->mem.access != VIR_DOMAIN_MEMORY_ACCESS_SHARED.
> 

Ah, yes - I missed the check when looking at the changes

-    if (virDomainNumaGetNodeCount(def->numa) &&
-        qemuBuildNumaArgStr(cfg, def, cmd, priv) < 0)

[...]

>>> diff --git a/tests/qemuxml2argvdata/fd-memory-no-numa-topology.args b/tests/qemuxml2argvdata/fd-memory-no-numa-topology.args
>>> index bd88daaa3b..400fb39cc6 100644
>>> --- a/tests/qemuxml2argvdata/fd-memory-no-numa-topology.args
>>> +++ b/tests/qemuxml2argvdata/fd-memory-no-numa-topology.args
>>> @@ -11,6 +11,10 @@ QEMU_AUDIO_DRV=none \
>>>  -m 14336 \
>>>  -mem-prealloc \
>>>  -smp 8,sockets=8,cores=1,threads=1 \
>>> +-object memory-backend-file,id=ram-node,\
>>> +mem-path=/var/lib/libvirt/qemu/ram/libvirt/qemu/-1-instance-00000092/ram-node,\
>>> +share=yes,size=15032385536 \
>>
>> Curious does that perhaps rather large file for mem-path get created? If
>> so, wow!  Seems to me something like this would need to be documented or
>> as noted earlier be configurable.
> 
> You mean during tests? no, it's created by qemu afaik.
> 
> I am not a libvirt expert either, and I would rather have no file be
> created when you want shared memory. That's what the next patches
> propose with memfd support.
> 

So my point/question was someone may not realize (or want) such a large
file to be created by default for them in /var/lib/libvirt...  As noted
above maybe this becomes an "extra" attribute to request a non NUMA
backed shared file with an additional element path that would provide
the path to the file rather than the defualt /var/lib/libvirt...

John

(hopefully this all makes sense - I'm only one coffee in so far)

>>
>> John
>>
>>> +-numa node,nodeid=0,memdev=ram-node \
>>>  -uuid 126f2720-6f8e-45ab-a886-ec9277079a67 \
>>>  -display none \
>>>  -no-user-config \
>>>
> 
> thanks
> 




More information about the libvir-list mailing list