[libvirt] [PATCH] qemu: Don't use -mem-prealloc along with .prealloc=yes

Martin Kletzander mkletzan at redhat.com
Wed Nov 7 13:16:18 UTC 2018


On Wed, Nov 07, 2018 at 10:47:01AM +0100, Michal Privoznik wrote:
>On 11/07/2018 12:43 AM, John Ferlan wrote:
>>
>>
>> On 11/5/18 9:49 AM, Michal Privoznik wrote:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1624223
>>>
>>> There are two ways to request memory preallocation on cmd line:
>>> -mem-prealloc and .prealloc attribute to memory-backend-file.
>>
>> s/to/for a/ ?
>>
>>> However, as it turns out it's not safe to use both at the same
>>> time. Prefer -mem-prealloc as it is more backward compatible
>>> compared to switching to "-numa node,memdev=  + -object
>>> memory-backend-file".
>>>
>>
>> FWIW: Issue introduced by commit 1c4f3b56..
>>
>> While I understand the reasoning, it's really too bad we couldn't "move"
>> the determination over which conflicting qualifier is used to earlier.
>> By the time we call the -numa backend we would already have had to make
>> the choice if I'm reading the ordering right.
>
>Correct, you're reading it right.
>
>>
>> But if it doesn't matter for the -numa object to use the -mem-prealloc,
>> then who am I to complain.  Of course the "future thinking" me that is
>> living in the present issues surrounding machine and pc makes me wonder
>> if choosing this as the default going forward into the future where
>> someone could deprecate the -mem-prealloc because -numa will be so
>> prevalent won't bite us down the road.
>
>If -mem-prealloc is deprecated then we would have to construct -object
>memory-backend-file. I'm not against this, but IIRC this fails during
>migration. I mean, if you have a guest that uses -mem-path you can't
>migrate it to -object memory-backend-file because qemu would fail to
>load the migration stream. That is why we have @needBackend in
>qemuBuildNumaArgStr(), so that new cmd line is built iff really needed.
>
>This is the reason I went this way even though BZ suggests otherwise.
>
>>
>> Curious how others feel - I'm not against this choice, just trying to
>> supply an opposing/differing viewpoint. We really have to start coding
>> for the future and consider what deprecation could mean especially for
>> arguments that essentially mean the same thing.
>>
>>> Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
>>> ---
>>>  src/qemu/qemu_command.c                       | 37 +++++++++++++------
>>>  src/qemu/qemu_command.h                       |  1 +
>>>  src/qemu/qemu_domain.c                        |  2 +
>>>  src/qemu/qemu_domain.h                        |  3 ++
>>>  src/qemu/qemu_hotplug.c                       |  3 +-
>>>  .../hugepages-numa-default-dimm.args          |  2 +-
>>>  6 files changed, 35 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
>>> index e338d3172e..0294030f0e 100644
>>> --- a/src/qemu/qemu_command.c
>>> +++ b/src/qemu/qemu_command.c
>>> @@ -3123,6 +3123,7 @@ qemuBuildControllerDevCommandLine(virCommandPtr cmd,
>>>   * @def: domain definition object
>>>   * @mem: memory definition object
>>>   * @autoNodeset: fallback nodeset in case of automatic NUMA placement
>>> + * @forbidPrealloc: don't set prealloc attribute
>>
>> Slight bikeshed, but this changes the priv->memPrealloc to @forbidPrealloc
>> which is IMO a bit odd.
>
>Okay, what name do you suggest? My reasoning for the name was that it
>should make sense from the function POV. That's why calling the variable
>'memPrealloc' did not make sense to me.
>
>>
>> Beyond that, this becomes the 3rd @priv field to be passed along...
>> Maybe @priv should just be passed to access qemuCaps, autoNodeset, and
>> memPrealloc.
>
>Ah sure.
>
>>
>>>   * @force: forcibly use one of the backends
>>>   *
>>>   * Creates a configuration object that represents memory backend of given guest
>>> @@ -3136,6 +3137,9 @@ qemuBuildControllerDevCommandLine(virCommandPtr cmd,
>>>   * Then, if one of the two memory-backend-* should be used, the @qemuCaps is
>>>   * consulted to check if qemu does support it.
>>>   *
>>> + * If @forbidPrealloc is true then 'prealloc' attribute of the backend is not
>>> + * set. This may come in handy when global -mem-prealloc is already specified.
>>> + *
>>>   * Returns: 0 on success,
>>>   *          1 on success and if there's no need to use memory-backend-*
>>>   *         -1 on error.
>>> @@ -3148,6 +3152,7 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps,
>>>                              virDomainDefPtr def,
>>>                              virDomainMemoryDefPtr mem,
>>>                              virBitmapPtr autoNodeset,
>>> +                            bool forbidPrealloc,
>>>                              bool force)
>>>  {
>>>      const char *backendType = "memory-backend-file";
>>> @@ -3265,11 +3270,13 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps,
>>>          if (mem->nvdimmPath) {
>>>              if (VIR_STRDUP(memPath, mem->nvdimmPath) < 0)
>>>                  goto cleanup;
>>> -            prealloc = true;
>>> +            if (!forbidPrealloc)
>>> +                prealloc = true;
>>>          } else if (useHugepage) {
>>>              if (qemuGetDomainHupageMemPath(def, cfg, pagesize, &memPath) < 0)
>>>                  goto cleanup;
>>> -            prealloc = true;
>>> +            if (!forbidPrealloc)
>>> +                prealloc = true;
>>>          } else {
>>>              /* We can have both pagesize and mem source. If that's the case,
>>>               * prefer hugepages as those are more specific. */
>>> @@ -3398,7 +3405,8 @@ qemuBuildMemoryCellBackendStr(virDomainDefPtr def,
>>>      mem.info.alias = alias;
>>>
>>>      if ((rc = qemuBuildMemoryBackendProps(&props, alias, cfg, priv->qemuCaps,
>>> -                                          def, &mem, priv->autoNodeset, false)) < 0)
>>> +                                          def, &mem, priv->autoNodeset,
>>> +                                          priv->memPrealloc, false)) < 0)
>>>          goto cleanup;
>>>
>>>      if (virQEMUBuildObjectCommandlineFromJSON(buf, props) < 0)
>>> @@ -3435,7 +3443,8 @@ qemuBuildMemoryDimmBackendStr(virBufferPtr buf,
>>>          goto cleanup;
>>>
>>>      if (qemuBuildMemoryBackendProps(&props, alias, cfg, priv->qemuCaps,
>>> -                                    def, mem, priv->autoNodeset, true) < 0)
>>> +                                    def, mem, priv->autoNodeset,
>>> +                                    priv->memPrealloc, true) < 0)
>>>          goto cleanup;
>>>
>>>      if (virQEMUBuildObjectCommandlineFromJSON(buf, props) < 0)
>>> @@ -7443,7 +7452,8 @@ qemuBuildSmpCommandLine(virCommandPtr cmd,
>>>  static int
>>>  qemuBuildMemPathStr(virQEMUDriverConfigPtr cfg,
>>>                      const virDomainDef *def,
>>> -                    virCommandPtr cmd)
>>> +                    virCommandPtr cmd,
>>> +                    qemuDomainObjPrivatePtr priv)
>>>  {
>>>      const long system_page_size = virGetSystemPageSizeKB();
>>>      char *mem_path = NULL;
>>> @@ -7465,8 +7475,10 @@ qemuBuildMemPathStr(virQEMUDriverConfigPtr cfg,
>>>          return 0;
>>>      }
>>>
>>> -    if (def->mem.allocation != VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE)
>>> +    if (def->mem.allocation != VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE) {
>>>          virCommandAddArgList(cmd, "-mem-prealloc", NULL);
>>> +        priv->memPrealloc = true;
>>> +    }
>>>
>>>      virCommandAddArgList(cmd, "-mem-path", mem_path, NULL);
>>>      VIR_FREE(mem_path);
>>> @@ -7479,7 +7491,8 @@ static int
>>>  qemuBuildMemCommandLine(virCommandPtr cmd,
>>>                          virQEMUDriverConfigPtr cfg,
>>>                          const virDomainDef *def,
>>> -                        virQEMUCapsPtr qemuCaps)
>>> +                        virQEMUCapsPtr qemuCaps,
>>> +                        qemuDomainObjPrivatePtr priv)
>>>  {
>>>      if (qemuDomainDefValidateMemoryHotplug(def, qemuCaps, NULL) < 0)
>>>          return -1;
>>> @@ -7498,15 +7511,17 @@ qemuBuildMemCommandLine(virCommandPtr cmd,
>>>                                virDomainDefGetMemoryInitial(def) / 1024);
>>>      }
>>>
>>> -    if (def->mem.allocation == VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE)
>>> +    if (def->mem.allocation == VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE) {
>>>          virCommandAddArgList(cmd, "-mem-prealloc", NULL);
>>> +        priv->memPrealloc = true;
>>> +    }
>>
>> I find it "confusing" that memPrealloc is set to true here when
>> "def->mem.allocation == VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE",
>> while in qemuBuildMemPathStr it's a != comparison.
>>
>> I know it's existing, but strange.
>
>This is so that -mem-prealloc is not added twice onto the cmd line. The
>first addition is done here and the second is done possibly in
>qemuBuildMemPathStr ..
>

Yeah, the memory code is... I have no words for that.  The problem is that there
are not many refactorings that would work in all the freaking corner cases that
we (unfortunately) decided to cover.

It's as Michal says: on this occasion the mempath should be added here
unconditionally, but the condition is there so that it does not appear twice.
Also, it needs to be added there before the memory objects, so setting a
variable and then acting upon it later is not the right refactor (I hope I
remember that correctly).

If you put it in the caller, then you run into the fact that it's not the only
caller...  I don't remember how the second caller handles the prealloc, but I
remember we tried a bunch of refactors and there's not much to do until we
deprecate some QEMU versions.
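
For anyone following along, the "not twice" logic can be sketched roughly as
below.  This is only a minimal model under my own assumptions: the types and
the names guest, cmdline, build_mem(), build_mem_path() and backend_prealloc()
are made up for illustration and are not libvirt code, the real functions check
many more things, and -mem-path's value is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; these are not libvirt types. */
enum alloc_mode { ALLOC_DEFAULT, ALLOC_IMMEDIATE };

struct guest {
    enum alloc_mode allocation;  /* plays the role of def->mem.allocation */
    bool hugepages;
    size_t numa_nodes;
};

#define MAX_ARGS 8

struct cmdline {
    const char *args[MAX_ARGS];
    size_t nargs;
    bool mem_prealloc;           /* plays the role of priv->memPrealloc */
};

static void add_arg(struct cmdline *cmd, const char *arg)
{
    if (cmd->nargs < MAX_ARGS)
        cmd->args[cmd->nargs++] = arg;
}

/* Rough analogue of qemuBuildMemPathStr(): hugepages always preallocate,
 * but the flag is skipped when the caller has already emitted it. */
static void build_mem_path(const struct guest *g, struct cmdline *cmd)
{
    if (!g->hugepages)
        return;

    if (g->allocation != ALLOC_IMMEDIATE) {
        add_arg(cmd, "-mem-prealloc");
        cmd->mem_prealloc = true;
    }
    add_arg(cmd, "-mem-path");   /* the path value is omitted in this model */
}

/* Rough analogue of qemuBuildMemCommandLine(). */
static void build_mem(const struct guest *g, struct cmdline *cmd)
{
    if (g->allocation == ALLOC_IMMEDIATE) {
        add_arg(cmd, "-mem-prealloc");
        cmd->mem_prealloc = true;
    }
    if (g->numa_nodes == 0)
        build_mem_path(g, cmd);
}

/* Rough analogue of the patched backend builder: leave 'prealloc' unset
 * on the memory backend when the global flag is already present. */
static bool backend_prealloc(const struct cmdline *cmd, bool wants_prealloc)
{
    return wants_prealloc && !cmd->mem_prealloc;
}

/* Helper: how many times does @arg appear on the modelled command line? */
static size_t count_arg(const struct cmdline *cmd, const char *arg)
{
    size_t i, n = 0;

    for (i = 0; i < cmd->nargs; i++)
        if (strcmp(cmd->args[i], arg) == 0)
            n++;
    return n;
}
```

With hugepages and no NUMA nodes, the == and != checks are complementary, so
the flag ends up on the command line exactly once whichever allocation mode is
set, and the backend builder then knows not to add its own prealloc.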

>>
>> Again, I'm not against this, but would like to see if someone with more
>> numa experience chimes in (Martin?) and whether we need to think more in
>> terms of what deprecation could mean.
>
>It would mean inability to migrate to newer libvirt.
>

Well, since we deprecated some QEMU versions (finally), we can just add a flag
to the migration cookie that tells us whether the current libvirt version
prefers .prealloc, and just start using that for newly started VMs.
Migrating to an older version won't work, but that's not supported.  There are
exceptions, of course, but anyone can handle those by backporting support for
the flag if that's your (or your distro vendor's) need.
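
A rough sketch of what I mean, with entirely hypothetical names (there is no
such cookie flag today, and start_ctx and choose_prealloc_style() are invented
here just to show the decision, not proposed as the actual API):

```c
#include <stdbool.h>

/* Hypothetical names throughout; the cookie flag is only the idea described
 * above, not an existing element of libvirt's migration cookie. */
enum prealloc_style {
    PREALLOC_GLOBAL,   /* -mem-prealloc on the command line */
    PREALLOC_BACKEND   /* prealloc=yes on the memory-backend-* object */
};

struct start_ctx {
    bool incoming_migration;      /* false for a freshly started VM */
    bool cookie_prefers_backend;  /* flag read from the source's cookie */
};

static enum prealloc_style
choose_prealloc_style(const struct start_ctx *ctx)
{
    /* Newly started VMs can use the new style right away. */
    if (!ctx->incoming_migration)
        return PREALLOC_BACKEND;

    /* For incoming migration, mirror whatever the source side used; a
     * source that never set the flag is assumed to use -mem-prealloc. */
    return ctx->cookie_prefers_backend ? PREALLOC_BACKEND : PREALLOC_GLOBAL;
}
```

An older destination would not understand the flag at all, which is the
"migrating to an older version won't work" part.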

TL;DR: The pre-existing condition is actually fine, unfortunately.

>>
>> John
>>
>>>
>>>      /*
>>>       * Add '-mem-path' (and '-mem-prealloc') parameter here if
>>>       * the hugepages and no numa node is specified.
>>>       */
>>>      if (!virDomainNumaGetNodeCount(def->numa) &&
>>> -        qemuBuildMemPathStr(cfg, def, cmd) < 0)
>>> +        qemuBuildMemPathStr(cfg, def, cmd, priv) < 0)
>
>.. called here.
>
>Michal