[PATCH] qemu: Relax memory pre-allocation rules

Michal Privoznik mprivozn at redhat.com
Mon Nov 30 10:06:14 UTC 2020


Currently, we configure QEMU to prealloc memory almost by
default. Well, by default for NVDIMMs, hugepages and if user
asked us to (via memoryBacking <allocation mode="immediate"/>).

However, there are two cases where this approach is not the best:

1) in case when guest's NVDIMM is backed by real life NVDIMM. In
   this case users should put <pmem/> into the <memory/> device
   <source/>, like this:

   <memory model='nvdimm' access='shared'>
     <source>
       <path>/dev/pmem0</path>
       <pmem/>
     </source>
   </memory>

   Instructing QEMU to do prealloc in this case means that each
   page of the NVDIMM is "touched" (the first byte is read and
   written back - see QEMU commit v2.9.0-rc1~26^2) which cripples
   device wear.

2) if free-page-reporting is turned on. While the
   free-page-reporting feature might not have a catchy or obvious
   name, when enabled it instructs KVM and subsequently QEMU to
   free pages no longer used by guest resulting in smaller memory
   footprint. And preallocating whole memory goes against this.

The BZ comment 11 mentions another, third case 'virtio-mem' but
that is not implemented in libvirt, yet.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1894053
Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
---
 src/qemu/qemu_command.c                               | 11 +++++++++--
 .../memory-hotplug-nvdimm-pmem.x86_64-latest.args     |  2 +-
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 479bcc0b0c..3df8b5ac76 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -2977,7 +2977,11 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps,
     if (discard == VIR_TRISTATE_BOOL_ABSENT)
         discard = def->mem.discard;
 
-    if (def->mem.allocation == VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE)
+    /* The whole point of free_page_reporting is that as soon as guest frees
+     * any memory it is freed in the host too. Prealloc doesn't make much sense
+     * then. */
+    if (def->mem.allocation == VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE &&
+        def->memballoon->free_page_reporting != VIR_TRISTATE_SWITCH_ON)
         prealloc = true;
 
     if (virDomainNumatuneGetMode(def->numa, mem->targetNode, &mode) < 0 &&
@@ -3064,7 +3068,10 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps,
 
         if (mem->nvdimmPath) {
             memPath = g_strdup(mem->nvdimmPath);
-            prealloc = true;
+            /* If the NVDIMM is a real device then there's nothing to prealloc.
+             * If anyhing, we would be only wearing off the device. */
+            if (!mem->nvdimmPmem)
+                prealloc = true;
         } else if (useHugepage) {
             if (qemuGetDomainHupageMemPath(priv->driver, def, pagesize, &memPath) < 0)
                 return -1;
diff --git a/tests/qemuxml2argvdata/memory-hotplug-nvdimm-pmem.x86_64-latest.args b/tests/qemuxml2argvdata/memory-hotplug-nvdimm-pmem.x86_64-latest.args
index cac02a6f6d..fb4ae4b518 100644
--- a/tests/qemuxml2argvdata/memory-hotplug-nvdimm-pmem.x86_64-latest.args
+++ b/tests/qemuxml2argvdata/memory-hotplug-nvdimm-pmem.x86_64-latest.args
@@ -20,7 +20,7 @@ file=/tmp/lib/domain--1-QEMUGuest1/master-key.aes \
 -object memory-backend-ram,id=ram-node0,size=224395264 \
 -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 \
 -object memory-backend-file,id=memnvdimm0,mem-path=/tmp/nvdimm,share=no,\
-prealloc=yes,size=536870912,pmem=yes \
+size=536870912,pmem=yes \
 -device nvdimm,node=0,memdev=memnvdimm0,id=nvdimm0,slot=0 \
 -uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \
 -display none \
-- 
2.26.2




More information about the libvir-list mailing list