[Virtio-fs] Large memory consumption by virtiofsd, suspecting fd's aren't being closed?
Sergio Lopez
slp at redhat.com
Tue Mar 23 11:55:26 UTC 2021
On Mon, Mar 22, 2021 at 12:47:04PM -0400, Vivek Goyal wrote:
> On Mon, Mar 22, 2021 at 05:09:32PM +0100, Miklos Szeredi wrote:
> > On Mon, Mar 22, 2021 at 6:52 AM Eric Ernst <eric_ernst at apple.com> wrote:
> > >
> > > Hey y’all,
> > >
> > > One challenge I’ve been looking at is how to set up an appropriate memory cgroup limit for workloads that are leveraging virtiofs (i.e., running pods with Kata Containers). I noticed that the memory usage of the daemon itself can grow considerably depending on the workload, much more than I’d expect.
> > >
> > > I’m running a workload that simply runs a build of the kernel sources with -j3. In doing this, the sources of the Linux kernel are shared via virtiofs (no DAX), so as the build goes on, a lot of files are opened, closed, and created. The RSS of virtiofsd grows to several hundred MB.
> > >
> > > When taking a look, I suspect that virtiofsd is carrying out the opens but never actually closing the fds. In the guest, I’m seeing fds on the order of 10-40 across all the container processes as the build runs, whereas the number of fds held by virtiofsd keeps increasing, reaching over 80,000. I’m guessing this isn’t expected?
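As a quick illustration (assuming a single host process named virtiofsd;
the lookup is just an example), the number of fds the daemon is holding
can be watched from the host with:

  ls /proc/$(pidof virtiofsd)/fd | wc -l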
> >
> > The reason could be that the guest is keeping a ref on the inodes
> > (dcache->dentry->inode) and the current implementation of the server
> > keeps an O_PATH fd open for each inode referenced by the client.
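A hedged way to check this from the host, with <pid> and <fd> as
placeholders, is to look at where the accumulated descriptors point and
at their open flags; O_PATH should show up as the 010000000 bit in the
octal 'flags:' field:

  ls -l /proc/<pid>/fd | head    # most targets should be under the shared dir
  cat /proc/<pid>/fdinfo/<fd>    # 'flags:' should include O_PATH (010000000)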
> >
> > One way to avoid this is to use the "cache=none" option, which forces
> > the client to drop dentries from the cache immediately when they are
> > not in use. This is not desirable if the cache is actually being used.
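For reference, a minimal sketch of passing that option on the C
virtiofsd command line (socket path and source directory are
placeholders):

  ./virtiofsd --socket-path=/tmp/vhostqemu -o source=/path/to/shared -o cache=none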
> >
> > The memory use of the server should still be limited by the memory use
> > of the guest: if there's memory pressure in the guest kernel, then it
> > will clean out caches, which results in the memory use decreasing in
> > the server as well. If the server memory use looks unbounded, that
> > might be indicative of too much memory used for dcache in the guest
> > (cat /proc/slabinfo | grep ^dentry). Can you verify?
>
> Hi Miklos,
>
> Apart from the above, we identified one more issue on IRC. I asked Eric
> to drop caches manually in the guest (echo 3 > /proc/sys/vm/drop_caches),
> and while it reduced the number of open fds, it did not seem to free up
> a significant amount of memory.
>
> So the question remains: where is that memory? One possibility is the
> memory allocated for the mapping arrays (inode and fd). These arrays
> only grow and never shrink, so they can lock down some memory.
>
> But still, a lot of lo_inode memory should have been freed when
> echo 3 > /proc/sys/vm/drop_caches was done. Why all of that did not
> show up in virtiofsd's RSS usage is a little confusing.

Are you including "RssShmem" in "RSS usage"? If so, that could be
misleading. When virtiofsd[-rs] touches pages that reside in the
memory mapping shared with QEMU, those pages are counted towards the
virtiofsd[-rs] process's RssShmem too.
In other words, the RSS value of the virtiofsd[-rs] process may be
inflated because it includes pages that are actually shared with the
QEMU process (there is no second copy of them).
This can be observed using a tool like "smem". Here's an example:
- This virtiofsd-rs process appears to have an RSS of ~633 MiB:
root 13879 46.1 7.9 8467492 649132 pts/1 Sl+ 11:33 0:52 ./target/debug/virtiofsd-rs
root 13947 69.3 13.4 5638580 1093876 pts/0 Sl+ 11:33 1:14 qemu-system-x86_64
- In /proc/13879/status we can observe most of that memory is
actually RssShmem:
RssAnon: 9624 kB
RssFile: 5136 kB
RssShmem: 634372 kB
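(As a quick cross-check: 9624 + 5136 + 634372 = 649132 kB, which matches
the RSS reported by ps above, so almost all of it is shared memory rather
than the daemon's own heap.)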
- In "smem", we can see a similar amount of RSS, but the PSS is
roughly half the size because "smem" is splitting it up between
virtiofsd-rs and QEMU:
[root@localhost ~]# smem -P virtiofsd-rs -P qemu
PID User Command Swap USS PSS RSS
13879 root ./target/debug/virtiofsd-rs 0 13412 337019 662392
13947 root qemu-system-x86_64 -enable- 0 434224 760096 1094392
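(As a back-of-the-envelope check, assuming the shared pages are mapped
by exactly two processes: PSS ≈ USS + (RSS - USS) / 2 = 13412 +
(662392 - 13412) / 2 ≈ 337902 kB, close to the 337019 kB reported above.)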
- If we terminate the virtiofsd-rs process, the output of "smem" now
shows that QEMU's PSS has grown to account for the PSS that was
previously assigned to virtiofsd-rs, confirming that it was memory
shared between both processes:
PID User Command Swap USS PSS RSS
13947 root qemu-system-x86_64 -enable- 0 1082656 1084966 1095692
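(Consistently, QEMU's PSS went from 760096 kB to 1084966 kB, an increase
of about 324870 kB, close to the 337019 kB of PSS that "smem" previously
attributed to virtiofsd-rs.)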
Just to be 100% sure, I've also run "heaptrack" on a virtiofsd-rs
instance, and can confirm that the actual heap usage of the process
was around 5-6 MiB.
Sergio.