[vfio-users] RLIMIT_MEMLOCK exceeded when enabling intremap

Alex Williamson alex.williamson at redhat.com
Tue Jul 3 20:56:03 UTC 2018


On Tue, 3 Jul 2018 18:42:04 +0530
Prasun Ratn <prasun.ratn at gmail.com> wrote:

> Hi
> 
> I am adding an IOMMU device using the following libvirt syntax (taken
> from https://libvirt.org/formatdomain.html#elementsIommu)
> 
>     <iommu model='intel'>
>       <driver intremap='on'/>
>     </iommu>
> 
> When I try to start the VM, it fails. If I remove the above 3 lines it
> starts fine.
> 
>     error: Failed to start domain rhel7.3-32T-nvme-ich9
>     error: internal error: qemu unexpectedly closed the monitor: 2018-06-28T15:24:31.401831Z qemu-kvm: -device vfio-pci,host=82:00.0,id=hostdev1,bus=pcie.0,addr=0xa: VFIO_MAP_DMA: -12
>     2018-06-28T15:24:31.401854Z qemu-kvm: -device vfio-pci,host=82:00.0,id=hostdev1,bus=pcie.0,addr=0xa: vfio_dma_map(0x556478bc0820, 0xc0000, 0x7ff40000, 0x7fd94e4c0000) = -12 (Cannot allocate memory)
>     2018-06-28T15:24:31.450793Z qemu-kvm: -device vfio-pci,host=82:00.0,id=hostdev1,bus=pcie.0,addr=0xa: VFIO_MAP_DMA: -12
>     2018-06-28T15:24:31.450804Z qemu-kvm: -device vfio-pci,host=82:00.0,id=hostdev1,bus=pcie.0,addr=0xa: vfio_dma_map(0x556478bc0820, 0x100000000, 0x180000000, 0x7fd9ce400000) = -12 (Cannot allocate memory)
>     2018-06-28T15:24:31.450878Z qemu-kvm: -device vfio-pci,host=82:00.0,id=hostdev1,bus=pcie.0,addr=0xa: vfio error: 0000:82:00.0: failed to setup container for group 37: memory listener initialization failed for container: Cannot allocate memory
> 
> In dmesg I see this:
> 
>     [189435.289113] vfio_pin_pages_remote: RLIMIT_MEMLOCK (9663676416) exceeded
>     [189435.338165] vfio_pin_pages_remote: RLIMIT_MEMLOCK (9663676416) exceeded
> 
> I have enough free memory (I think) and at the failing point enough
> memory seems to be available.
> 
>     $ free -h
>                   total        used        free      shared  buff/cache   available
>     Mem:           125G        1.4G        123G         17M        1.1G        123G
>     Swap:          1.0G          0B        1.0G
> 
> Here's the ulimit -l output (I changed limits.conf to set memlock to
> unlimited for qemu user and qemu group)
> 
>     $ ulimit -l
>     unlimited
> 
> 
>     $ sudo -u qemu sh -c "ulimit -l"
>     unlimited
> 
> memlock limit using systemctl
> 
>     $ systemctl show libvirtd.service | grep LimitMEMLOCK
>     LimitMEMLOCK=18446744073709551615
> 
> SELinux is disabled
> 
>     $ sestatus
>     SELinux status:                 disabled
> 
> 
> libvirt and kernel version
> 
>     $ virsh version
>     Compiled against library: libvirt 4.1.0
>     Using library: libvirt 4.1.0
>     Using API: QEMU 4.1.0
>     Running hypervisor: QEMU 2.9.0
> 
>     $ uname -r
>     3.10.0-693.5.2.el7.x86_64
> 
>     $ cat /etc/redhat-release
>     Red Hat Enterprise Linux Server release 7.4 (Maipo)
> 
> Any idea how to figure out why we are exceeding the memlock limit?

I'm guessing you're assigning multiple devices to the same VM, which
doesn't work well with a guest IOMMU currently.  The trouble is that
with a guest IOMMU, each assigned device gets a separate address space
that is initially configured to map the full address space of the VM,
and the vfio container backing each device is accounted separately.
libvirt only sets the locked memory limit to a value sufficient for
locking the guest memory once, whereas in this configuration we're
locking it once per assigned device.  Without a guest IOMMU, all
devices run in the same address space, and therefore the same
container, and we only account the memory once regardless of the
number of devices.
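
For illustration, once a configuration that does start is running
(e.g. without intremap, or after raising the limit as below), you can
count the vfio groups the qemu process holds open.  This is only a
sketch; the pgrep pattern assumes libvirt's usual "-name guest=<domain>"
naming, so adjust it to your setup:

    # Find the qemu process for this domain (naming may differ)
    pid=$(pgrep -f 'guest=rhel7.3-32T-nvme-ich9')

    # One /dev/vfio/<group> fd per assigned IOMMU group; with a vIOMMU
    # each device's container maps, and accounts, the full guest RAM
    # separately
    sudo ls -l /proc/$pid/fd | grep -c '/dev/vfio/[0-9]'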

Regardless of all your attempts to prove that the locked memory limit
is set to unlimited, I don't think that's actually the case for the
running qemu instance.  You should be able to use the hard_limit option
in the VM xml to increase the locked memory limit:

https://libvirt.org/formatdomain.html#elementsMemoryTuning
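
For what it's worth, the RLIMIT_MEMLOCK value in your dmesg output,
9663676416 bytes, is exactly 9GiB, which looks like libvirt's usual
<guest RAM> + 1GiB limit for a VM with assigned devices rather than
unlimited.  Something along these lines (same pgrep assumption as
above) shows the limit the running qemu process actually has:

    pid=$(pgrep -f 'guest=rhel7.3-32T-nvme-ich9')
    # /proc/<pid>/limits reports the soft and hard limits in bytes
    grep 'Max locked memory' /proc/$pid/limits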

As above, I'd suggest <# hostdevs> x <VM memory size>
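
A minimal sketch of that in the domain XML, assuming two assigned
devices and, say, an 8GiB guest (both numbers are assumptions,
substitute your own):

    <domain type='kvm'>
      ...
      <memtune>
        <!-- 2 hostdevs x 8GiB guest RAM, plus slack for qemu overhead -->
        <hard_limit unit='GiB'>18</hard_limit>
      </memtune>
      ...
    </domain>

If hard_limit is set, libvirt uses that value for the qemu process's
locked memory limit at startup, so it needs to cover what all of the
containers can pin together.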

The next question would be, why are you trying to use a guest IOMMU in
the first place?  The typical "production" use case is to run
userspace drivers, like DPDK, within the guest.  Device assignment to
a nested guest is also possible, but beyond proof of concept or
development work I don't know a practical use for it.  If your intent
is to get isolation between devices and drivers within the guest
(ie. not using iommu=pt in the guest), expect horrendous performance.
Thanks,

Alex



