[libvirt] [PATCH v2 3/3] qemu: add memfd source type
John Ferlan
jferlan at redhat.com
Tue Sep 18 21:02:29 UTC 2018
On 09/17/2018 09:14 AM, marcandre.lureau at redhat.com wrote:
> From: Marc-André Lureau <marcandre.lureau at redhat.com>
>
> Add a new memoryBacking source type "memfd", supported by QEMU (when
> the apability is available).
*capability
>
> A memfd is a specialized anonymous memory kind. As such, an anonymous
> source type could be automatically using a memfd. However, there are
> some complications when migrating from different memory backends in
> qemu (mainly due to the internal object naming at this point, but
> there could be more). For now, it is simpler and safer to simply
> introduce a new source type "memfd". Eventually, the "anonymous" type
> could learn to use memfd transparently in a seperate change.
*separate
>
> The main benefits are that it doesn't need to create filesystem files,
> and it also enforces sealing, providing a bit more safety.
>
> Signed-off-by: Marc-André Lureau <marcandre.lureau at redhat.com>
> ---
> docs/formatdomain.html.in | 9 +--
> docs/schemas/domaincommon.rng | 1 +
> src/conf/domain_conf.c | 3 +-
> src/conf/domain_conf.h | 1 +
> src/qemu/qemu_command.c | 69 +++++++++++++------
> src/qemu/qemu_domain.c | 12 +++-
> .../memfd-memory-numa.x86_64-latest.args | 34 +++++++++
> tests/qemuxml2argvdata/memfd-memory-numa.xml | 36 ++++++++++
> tests/qemuxml2argvtest.c | 2 +
> 9 files changed, 140 insertions(+), 27 deletions(-)
> create mode 100644 tests/qemuxml2argvdata/memfd-memory-numa.x86_64-latest.args
> create mode 100644 tests/qemuxml2argvdata/memfd-memory-numa.xml
>
More recently I've been trying to enforce separating XML/conf/rng/docs
changes from qemu/args changes... This makes review and testing a bit
easier and more "restricted".
Since I didn't make it clear previously and I can split things up - no
problem. I'll also be adding a "qemuxml2xmltest" for the input file to
"prove" it generates the output. It'll of course need to add the
QEMU_CAPS_OBJECT_MEMORY_MEMFD_HUGETLB to the DO_TEST.
Adding xml2xmltest is something required when we add new attributes or
input options.
I'll split the commit message appropriately too.
BTW: I think if "someone" follows this up with moving the qemu_command
logic into a new qemuDomainPrepare* method, then I think we can separate
the "new" or "fresh" start from the migration start and thus might be
able to generate a mechanism that would use memfd for anonymous with the
right capabilities present. Not sure it'll fly, but it may be worth a
shot. It's getting more and more painful to be stuck with "old stuff".
> diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> index 1f12ab5b42..eeee1f6d40 100644
> --- a/docs/formatdomain.html.in
> +++ b/docs/formatdomain.html.in
> @@ -1099,7 +1099,7 @@
> </hugepages>
> <nosharepages/>
> <locked/>
> - <source type="file|anonymous"/>
> + <source type="file|anonymous|memfd"/>
> <access mode="shared|private"/>
> <allocation mode="immediate|ondemand"/>
> <discard/>
> @@ -1150,9 +1150,10 @@
> suitable for the specific environment at the same time to mitigate
> the risks described above. <span class="since">Since 1.0.6</span></dd>
> <dt><code>source</code></dt>
> - <dd>Using the <code>type</code> attribute, it's possible to provide
> - "file" to utilize file memorybacking or keep the default
> - "anonymous".</dd>
> + <dd>Using the <code>type</code> attribute, it's possible to
> + provide "file" to utilize file memorybacking or keep the
> + default "anonymous". <span class="since">Since 4.8.0</span>,
> + you may choose "memfd" backing. (QEMU/KVM only)</dd>
Need to keep format consistent, I'll adjust.
> <dt><code>access</code></dt>
> <dd>Using the <code>mode</code> attribute, specify if the memory is
> to be "shared" or "private". This can be overridden per numa node by
> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
> index 099a949cf8..4b431b4188 100644
> --- a/docs/schemas/domaincommon.rng
> +++ b/docs/schemas/domaincommon.rng
> @@ -655,6 +655,7 @@
> <choice>
> <value>file</value>
> <value>anonymous</value>
> + <value>memfd</value>
> </choice>
> </attribute>
> </element>
> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
> index 1ee43950ae..648015b5b5 100644
> --- a/src/conf/domain_conf.c
> +++ b/src/conf/domain_conf.c
> @@ -894,7 +894,8 @@ VIR_ENUM_IMPL(virDomainDiskMirrorState, VIR_DOMAIN_DISK_MIRROR_STATE_LAST,
> VIR_ENUM_IMPL(virDomainMemorySource, VIR_DOMAIN_MEMORY_SOURCE_LAST,
> "none",
> "file",
> - "anonymous")
> + "anonymous",
> + "memfd")
syntax-check would tell you thou shalt not use tabs
>
> VIR_ENUM_IMPL(virDomainMemoryAllocation, VIR_DOMAIN_MEMORY_ALLOCATION_LAST,
> "none",
[...]
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index 2fd8a2a268..4983669a34 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -3949,7 +3949,8 @@ qemuDomainDefValidateFeatures(const virDomainDef *def,
>
>
> static int
> -qemuDomainDefValidateMemory(const virDomainDef *def)
> +qemuDomainDefValidateMemory(const virDomainDef *def,
> + virQEMUCapsPtr qemuCaps)
> {
> const long system_page_size = virGetSystemPageSizeKB();
> const virDomainMemtune *mem = &def->mem;
> @@ -3971,6 +3972,13 @@ qemuDomainDefValidateMemory(const virDomainDef *def)
> return -1;
> }
>
> + if (mem->source == VIR_DOMAIN_MEMORY_SOURCE_MEMFD &&
> + !virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_MEMFD_HUGETLB)) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> + _("hugepages is not support with memfd memory source"));
_("hugepages are not supported using memfd memory "
"source with this version of QEMU"));
> + return -1;
> + }
> +
> /* We can't guarantee any other mem.access
> * if no guest NUMA nodes are defined. */
> if (mem->hugepages[0].size != system_page_size &&
> @@ -4110,7 +4118,7 @@ qemuDomainDefValidate(const virDomainDef *def,
> if (qemuDomainDefValidateFeatures(def, qemuCaps) < 0)
> goto cleanup;
>
> - if (qemuDomainDefValidateMemory(def) < 0)
> + if (qemuDomainDefValidateMemory(def, qemuCaps) < 0)
> goto cleanup;
>
> ret = 0;
[...]
> diff --git a/tests/qemuxml2argvdata/memfd-memory-numa.xml b/tests/qemuxml2argvdata/memfd-memory-numa.xml
> new file mode 100644
> index 0000000000..8416a990fa
> --- /dev/null
> +++ b/tests/qemuxml2argvdata/memfd-memory-numa.xml
I don't recall from the original change, but each of the lines is
prefixed by 2 extra spaces... I'll fix before pushing.
I can fixup the nits noted. I'll wait until tomorrow before pushing so
that if Michal or Pavel wish to comment they can...
Reviewed-by: John Ferlan <jferlan at redhat.com>
John
> @@ -0,0 +1,36 @@
> + <domain type='kvm' id='56'>
> + <name>instance-00000092</name>
> + <uuid>126f2720-6f8e-45ab-a886-ec9277079a67</uuid>
> + <memory unit='KiB'>14680064</memory>
> + <currentMemory unit='KiB'>14680064</currentMemory>
> + <memoryBacking>
> + <hugepages>
> + <page size="2" unit="M"/>
> + </hugepages>
> + <source type='memfd'/>
> + <access mode='shared'/>
> + <allocation mode='immediate'/>
> + </memoryBacking>
> + <numatune>
> + <memnode cellid='0' mode='preferred' nodeset='3'/>
> + </numatune>
> + <vcpu placement='static'>8</vcpu>
> + <os>
> + <type arch='x86_64' machine='pc-i440fx-wily'>hvm</type>
> + <boot dev='hd'/>
> + </os>
> + <cpu>
> + <topology sockets='1' cores='8' threads='1'/>
> + <numa>
> + <cell id='0' cpus='0-7' memory='14680064' unit='KiB'/>
> + </numa>
> + </cpu>
> + <clock offset='utc'/>
> + <on_poweroff>destroy</on_poweroff>
> + <on_reboot>restart</on_reboot>
> + <on_crash>destroy</on_crash>
> + <devices>
> + <emulator>/usr/bin/qemu-system-x86_64</emulator>
> + <memballoon model='virtio'/>
> + </devices>
> + </domain>
[...]
More information about the libvir-list
mailing list