[libvirt] [PATCH 3/5] qemu: prefer memfd for anonymous memory

Marc-André Lureau marcandre.lureau at redhat.com
Tue Sep 11 07:53:57 UTC 2018


Hi

On Tue, Sep 11, 2018 at 2:46 AM, John Ferlan <jferlan at redhat.com> wrote:
>
> On 09/07/2018 07:32 AM, marcandre.lureau at redhat.com wrote:
>> From: Marc-André Lureau <marcandre.lureau at redhat.com>
>>
>
> Would be nice to have a few more words here. If you provide them I can
> add them... The if statement is difficult to read unless you know what
> each field really means.

hostmem-memfd is quite similar to hostmem-file. The main benefits are
that it doesn't need to create filesystem files, and it also enforces
sealing, providing a bit more safety.

> secondary question - should we document what gets used?, e.g.:
>
> https://libvirt.org/formatdomain.html#elementsMemoryBacking
>
> Seems to me the preference to use memfd is for memory backing using
> anonymous source for nvdimm's without a defined path, but sometimes my
> wording doesn't match reality.

Yes it could be documented. But it's now an allocation decision that
could evolve, or an implementation detail.

Would you like to see something like that?

        <dt><code>source</code></dt>
-       <dd>In this attribute you can switch to file memorybacking or keep
-         default anonymous.</dd>
+       <dd>In this attribute you can switch to file memorybacking or
+       keep default anonymous. <span class="since">Since 4.8.0</span>,
+       when the memory is anonymous and the host supports it, libvirt
+       will use a memfd memory backing, providing additional safety
+       guarantees.
+       </dd>
        <dt><code>access</code></dt>


>
>> Signed-off-by: Marc-André Lureau <marcandre.lureau at redhat.com>
>> ---
>>  src/qemu/qemu_command.c | 61 +++++++++++++++++++++++++++++------------
>>  1 file changed, 43 insertions(+), 18 deletions(-)
>>
>> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
>> index c9e3a91e32..97cfc8a18d 100644
>> --- a/src/qemu/qemu_command.c
>> +++ b/src/qemu/qemu_command.c
>> @@ -3113,6 +3113,24 @@ qemuBuildControllerDevCommandLine(virCommandPtr cmd,
>>      return ret;
>>  }
>>
>> +static int
>> +qemuBuildMemoryBackendPropsShare(virJSONValuePtr props,
>> +                                 virDomainMemoryAccess memAccess)
>> +{
>> +    switch (memAccess) {
>> +    case VIR_DOMAIN_MEMORY_ACCESS_SHARED:
>> +        return virJSONValueObjectAdd(props, "b:share", true, NULL);
>> +
>> +    case VIR_DOMAIN_MEMORY_ACCESS_PRIVATE:
>> +        return virJSONValueObjectAdd(props, "b:share", false, NULL);
>> +
>> +    case VIR_DOMAIN_MEMORY_ACCESS_DEFAULT:
>> +    case VIR_DOMAIN_MEMORY_ACCESS_LAST:
>> +        break;
>> +    }
>> +
>> +    return 0;
>> +}
>>
>>  /**
>>   * qemuBuildMemoryBackendProps:
>
> The comments should have been updated... In particular:
>
>   "Then, if one of the two memory-backend-* should be used..."
>
>> @@ -3259,7 +3277,23 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps,
>>      if (!(props = virJSONValueNewObject()))
>>          return -1;
>>
>> -    if (useHugepage || mem->nvdimmPath || memAccess ||
>
> Is this preference over "-ram" or "-file"?  It would seem to me someone
> choosing "file" has a specific case and this is more for those other
> options where if capabilities exist, then we try to use them.


(tbh, I don't know if you could have both nvdimmPath source ==
ANONYMOUS. That seems incompatible to me)

So the 'if' statement reads: if memory source is anonymous, and qemu
supports memfd (and if hugepage is requested and memfd supports it),
then let's use memfd, otherwise, keep/use the existing allocation
rules.


>> +    if (!mem->nvdimmPath &&
>> +        def->mem.source == VIR_DOMAIN_MEMORY_SOURCE_ANONYMOUS &&
>> +        virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_MEMFD) &&
>> +        (!useHugepage || virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_MEMFD_HUGETLB))) {
>> +        backendType = "memory-backend-memfd";
>> +
>> +        if (useHugepage &&
>> +            (virJSONValueObjectAdd(props, "b:hugetlb", useHugepage, NULL) < 0 ||
>> +             virJSONValueObjectAdd(props, "U:hugetlbsize", pagesize << 10, NULL) < 0)) {
>> +            goto cleanup;
>> +        }
>> +
>> +        if (qemuBuildMemoryBackendPropsShare(props, memAccess) < 0) {
>> +            goto cleanup;
>> +        }
>
> Running syntax-check would fail because of the { }

Sorry, I keep mixing with the QEMU coding style...

>
>> +
>> +    } else if (useHugepage || mem->nvdimmPath || memAccess ||
>>          def->mem.source == VIR_DOMAIN_MEMORY_SOURCE_FILE) {
>>
>>          if (mem->nvdimmPath) {
>> @@ -3297,20 +3331,8 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps,
>>                  goto cleanup;
>>          }
>>
>> -        switch (memAccess) {
>> -        case VIR_DOMAIN_MEMORY_ACCESS_SHARED:
>> -            if (virJSONValueObjectAdd(props, "b:share", true, NULL) < 0)
>> -                goto cleanup;
>> -            break;
>> -
>> -        case VIR_DOMAIN_MEMORY_ACCESS_PRIVATE:
>> -            if (virJSONValueObjectAdd(props, "b:share", false, NULL) < 0)
>> -                goto cleanup;
>> -            break;
>> -
>> -        case VIR_DOMAIN_MEMORY_ACCESS_DEFAULT:
>> -        case VIR_DOMAIN_MEMORY_ACCESS_LAST:
>> -            break;
>> +        if (qemuBuildMemoryBackendPropsShare(props, memAccess) < 0) {
>> +            goto cleanup;
>>          }
>
> Running syntax-check would fail here because of the { }
>
> All this is fix-able without you needing to post another series, but I
> need you to provide the verbiage for the intro and perhaps something
> that could be added to the web page. I can adjust the patch accordingly.
>
> Assuming of course Michal doesn't have other reservations...
>
> Reviewed-by: John Ferlan <jferlan at redhat.com>

If you already resolved patch 1 & 2 conflicts, I would appreciate if
you could take care. Otherwise I'll have to rebase & resend the
patches.

thanks

>
> John
>
>>      } else {
>>          backendType = "memory-backend-ram";
>> @@ -7609,7 +7631,8 @@ qemuBuildNumaArgStr(virQEMUDriverConfigPtr cfg,
>>
>>      if (virDomainNumatuneHasPerNodeBinding(def->numa) &&
>>          !(virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_RAM) ||
>> -          virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_FILE))) {
>> +          virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_FILE) ||
>> +          virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_MEMFD))) {
>>          virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
>>                         _("Per-node memory binding is not supported "
>>                           "with this QEMU"));
>> @@ -7618,7 +7641,8 @@ qemuBuildNumaArgStr(virQEMUDriverConfigPtr cfg,
>>
>>      if (def->mem.nhugepages &&
>>          def->mem.hugepages[0].size != system_page_size &&
>> -        !virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_FILE)) {
>> +        !(virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_FILE) ||
>> +          virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_MEMFD_HUGETLB))) {
>>          virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
>>                         _("huge pages per NUMA node are not "
>>                           "supported with this QEMU"));
>> @@ -7635,7 +7659,8 @@ qemuBuildNumaArgStr(virQEMUDriverConfigPtr cfg,
>>       * need to check which approach to use */
>>      for (i = 0; i < ncells; i++) {
>>          if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_RAM) ||
>> -            virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_FILE)) {
>> +            virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_FILE) ||
>> +            virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_MEMFD)) {
>>
>>              if ((rc = qemuBuildMemoryCellBackendStr(def, cfg, i, priv,
>>                                                      &nodeBackends[i])) < 0)
>>




More information about the libvir-list mailing list