[libvirt] [PATCH v2 3/3] qemu: add memfd source type

John Ferlan jferlan at redhat.com
Tue Sep 18 21:02:29 UTC 2018



On 09/17/2018 09:14 AM, marcandre.lureau at redhat.com wrote:
> From: Marc-André Lureau <marcandre.lureau at redhat.com>
> 
> Add a new memoryBacking source type "memfd", supported by QEMU (when
> the apability is available).

*capability
> 
> A memfd is a specialized anonymous memory kind. As such, an anonymous
> source type could be automatically using a memfd. However, there are
> some complications when migrating from different memory backends in
> qemu (mainly due to the internal object naming at this point, but
> there could be more). For now, it is simpler and safer to simply
> introduce a new source type "memfd". Eventually, the "anonymous" type
> could learn to use memfd transparently in a seperate change.

*separate

> 
> The main benefits are that it doesn't need to create filesystem files,
> and it also enforces sealing, providing a bit more safety.
> 
> Signed-off-by: Marc-André Lureau <marcandre.lureau at redhat.com>
> ---
>  docs/formatdomain.html.in                     |  9 +--
>  docs/schemas/domaincommon.rng                 |  1 +
>  src/conf/domain_conf.c                        |  3 +-
>  src/conf/domain_conf.h                        |  1 +
>  src/qemu/qemu_command.c                       | 69 +++++++++++++------
>  src/qemu/qemu_domain.c                        | 12 +++-
>  .../memfd-memory-numa.x86_64-latest.args      | 34 +++++++++
>  tests/qemuxml2argvdata/memfd-memory-numa.xml  | 36 ++++++++++
>  tests/qemuxml2argvtest.c                      |  2 +
>  9 files changed, 140 insertions(+), 27 deletions(-)
>  create mode 100644 tests/qemuxml2argvdata/memfd-memory-numa.x86_64-latest.args
>  create mode 100644 tests/qemuxml2argvdata/memfd-memory-numa.xml
> 

More recently I've been trying to enforce separating XML/conf/rng/docs
changes from qemu/args changes... This makes review and testing a bit
easier and more "restricted".

I didn't make that clear previously, so I can split things up myself - no
problem. I'll also add a "qemuxml2xmltest" case for the input file to
"prove" that it generates the output. That will of course require adding
QEMU_CAPS_OBJECT_MEMORY_MEMFD_HUGETLB to the DO_TEST.

Adding an xml2xmltest case is required whenever we add new attributes or
input options.

I'll split the commit message appropriately too.

BTW: if "someone" follows this up by moving the qemu_command logic into
a new qemuDomainPrepare* method, then I think we can separate the "new"
or "fresh" start from the migration start, and thus might be able to
come up with a mechanism that would use memfd for anonymous when the
right capabilities are present. Not sure it'll fly, but it may be worth
a shot. It's getting more and more painful to be stuck with "old stuff".
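
To make that idea a bit more concrete, here is a rough sketch of the
decision such a prepare step might make. Every name in it is hypothetical
(the sketch* enums and sketchPickMemoryBackend() are not libvirt code),
and it assumes the only inputs that matter are the configured source type,
whether the memfd capability is present, and whether the start is an
incoming migration:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not libvirt types. */
typedef enum {
    SKETCH_SOURCE_NONE = 0,     /* nothing configured, treated as anonymous */
    SKETCH_SOURCE_FILE,
    SKETCH_SOURCE_ANONYMOUS,
    SKETCH_SOURCE_MEMFD,
} sketchMemorySource;

typedef enum {
    SKETCH_BACKEND_RAM = 0,
    SKETCH_BACKEND_FILE,
    SKETCH_BACKEND_MEMFD,
} sketchMemoryBackend;

/*
 * Pick a memory backend for a domain start.
 *
 * An explicit "file" or "memfd" request is honored as-is (validation is
 * assumed to have rejected unsupported configs already). For the
 * anonymous default, memfd is only chosen on a fresh start, so that an
 * incoming migration keeps the backend the source side was using.
 */
sketchMemoryBackend
sketchPickMemoryBackend(sketchMemorySource source,
                        bool hasMemfdCap,
                        bool incomingMigration)
{
    switch (source) {
    case SKETCH_SOURCE_FILE:
        return SKETCH_BACKEND_FILE;
    case SKETCH_SOURCE_MEMFD:
        return SKETCH_BACKEND_MEMFD;
    case SKETCH_SOURCE_NONE:
    case SKETCH_SOURCE_ANONYMOUS:
        if (hasMemfdCap && !incomingMigration)
            return SKETCH_BACKEND_MEMFD;
        return SKETCH_BACKEND_RAM;
    }
    return SKETCH_BACKEND_RAM;
}
```

Whether the choice would also have to be recorded somewhere so that a later
outgoing migration can describe it to the destination is an open question
that the sketch does not try to answer.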

> diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> index 1f12ab5b42..eeee1f6d40 100644
> --- a/docs/formatdomain.html.in
> +++ b/docs/formatdomain.html.in
> @@ -1099,7 +1099,7 @@
>      </hugepages>
>      <nosharepages/>
>      <locked/>
> -    <source type="file|anonymous"/>
> +    <source type="file|anonymous|memfd"/>
>      <access mode="shared|private"/>
>      <allocation mode="immediate|ondemand"/>
>      <discard/>
> @@ -1150,9 +1150,10 @@
>          suitable for the specific environment at the same time to mitigate
>          the risks described above. <span class="since">Since 1.0.6</span></dd>
>         <dt><code>source</code></dt>
> -       <dd>Using the <code>type</code> attribute, it's possible to provide
> -         "file" to utilize file memorybacking or keep the default
> -         "anonymous".</dd>
> +       <dd>Using the <code>type</code> attribute, it's possible to
> +       provide "file" to utilize file memorybacking or keep the
> +       default "anonymous". <span class="since">Since 4.8.0</span>,
> +       you may choose "memfd" backing. (QEMU/KVM only)</dd>

Need to keep format consistent, I'll adjust.

>         <dt><code>access</code></dt>
>         <dd>Using the <code>mode</code> attribute, specify if the memory is
>           to be "shared" or "private". This can be overridden per numa node by
> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
> index 099a949cf8..4b431b4188 100644
> --- a/docs/schemas/domaincommon.rng
> +++ b/docs/schemas/domaincommon.rng
> @@ -655,6 +655,7 @@
>                    <choice>
>                      <value>file</value>
>                      <value>anonymous</value>
> +                    <value>memfd</value>
>                    </choice>
>                  </attribute>
>                </element>
> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
> index 1ee43950ae..648015b5b5 100644
> --- a/src/conf/domain_conf.c
> +++ b/src/conf/domain_conf.c
> @@ -894,7 +894,8 @@ VIR_ENUM_IMPL(virDomainDiskMirrorState, VIR_DOMAIN_DISK_MIRROR_STATE_LAST,
>  VIR_ENUM_IMPL(virDomainMemorySource, VIR_DOMAIN_MEMORY_SOURCE_LAST,
>                "none",
>                "file",
> -              "anonymous")
> +              "anonymous",
> +	      "memfd")

syntax-check would tell you thou shalt not use tabs

>  
>  VIR_ENUM_IMPL(virDomainMemoryAllocation, VIR_DOMAIN_MEMORY_ALLOCATION_LAST,
>                "none",

[...]

> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index 2fd8a2a268..4983669a34 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -3949,7 +3949,8 @@ qemuDomainDefValidateFeatures(const virDomainDef *def,
>  
>  
>  static int
> -qemuDomainDefValidateMemory(const virDomainDef *def)
> +qemuDomainDefValidateMemory(const virDomainDef *def,
> +                            virQEMUCapsPtr qemuCaps)
>  {
>      const long system_page_size = virGetSystemPageSizeKB();
>      const virDomainMemtune *mem = &def->mem;
> @@ -3971,6 +3972,13 @@ qemuDomainDefValidateMemory(const virDomainDef *def)
>          return -1;
>      }
>  
> +    if (mem->source == VIR_DOMAIN_MEMORY_SOURCE_MEMFD &&
> +        !virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_MEMFD_HUGETLB)) {
> +        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> +                       _("hugepages is not support with memfd memory source"));

    _("hugepages are not supported using memfd memory "
      "source with this version of QEMU"));

> +        return -1;
> +    }
> +
>      /* We can't guarantee any other mem.access
>       * if no guest NUMA nodes are defined. */
>      if (mem->hugepages[0].size != system_page_size &&
> @@ -4110,7 +4118,7 @@ qemuDomainDefValidate(const virDomainDef *def,
>      if (qemuDomainDefValidateFeatures(def, qemuCaps) < 0)
>          goto cleanup;
>  
> -    if (qemuDomainDefValidateMemory(def) < 0)
> +    if (qemuDomainDefValidateMemory(def, qemuCaps) < 0)
>          goto cleanup;
>  
>      ret = 0;

[...]

> diff --git a/tests/qemuxml2argvdata/memfd-memory-numa.xml b/tests/qemuxml2argvdata/memfd-memory-numa.xml
> new file mode 100644
> index 0000000000..8416a990fa
> --- /dev/null
> +++ b/tests/qemuxml2argvdata/memfd-memory-numa.xml

I don't recall from the original change, but each of the lines is
prefixed by 2 extra spaces... I'll fix before pushing.

I can fix up the nits noted. I'll wait until tomorrow before pushing so
that Michal or Pavel can comment if they wish...

Reviewed-by: John Ferlan <jferlan at redhat.com>

John

> @@ -0,0 +1,36 @@
> +  <domain type='kvm' id='56'>
> +    <name>instance-00000092</name>
> +    <uuid>126f2720-6f8e-45ab-a886-ec9277079a67</uuid>
> +    <memory unit='KiB'>14680064</memory>
> +    <currentMemory unit='KiB'>14680064</currentMemory>
> +    <memoryBacking>
> +      <hugepages>
> +          <page size="2" unit="M"/>
> +      </hugepages>
> +      <source type='memfd'/>
> +      <access mode='shared'/>
> +      <allocation mode='immediate'/>
> +    </memoryBacking>
> +    <numatune>
> +        <memnode cellid='0' mode='preferred' nodeset='3'/>
> +    </numatune>
> +    <vcpu placement='static'>8</vcpu>
> +    <os>
> +      <type arch='x86_64' machine='pc-i440fx-wily'>hvm</type>
> +      <boot dev='hd'/>
> +    </os>
> +    <cpu>
> +      <topology sockets='1' cores='8' threads='1'/>
> +      <numa>
> +        <cell id='0' cpus='0-7' memory='14680064' unit='KiB'/>
> +      </numa>
> +    </cpu>
> +    <clock offset='utc'/>
> +    <on_poweroff>destroy</on_poweroff>
> +    <on_reboot>restart</on_reboot>
> +    <on_crash>destroy</on_crash>
> +    <devices>
> +      <emulator>/usr/bin/qemu-system-x86_64</emulator>
> +      <memballoon model='virtio'/>
> +    </devices>
> +  </domain>

[...]



