[Virtio-fs] One virtiofs daemon per exported dir requirement

Stefan Hajnoczi stefanha at redhat.com
Wed Feb 19 15:24:05 UTC 2020


On Tue, Feb 18, 2020 at 06:58:31PM +0000, Dr. David Alan Gilbert wrote:
> * Daniel Walsh (dwalsh at redhat.com) wrote:
> > On 2/18/20 1:29 PM, Daniel Walsh wrote:
> > > On 2/18/20 8:38 AM, Stefan Hajnoczi wrote:
> > >> On Fri, Feb 14, 2020 at 07:41:30PM +0000, Dr. David Alan Gilbert wrote:
> > >>> * Vivek Goyal (vgoyal at redhat.com) wrote:
> > >>>> Hi,
> > >>>>
> > >>>> Dan Walsh and Mrunal mentioned that the one-virtiofsd-daemon-per-exported-
> > >>>> directory requirement sounds excessive. For the container use case, they have
> > >>>> at least 2-3 more directories they need to export (secrets and /etc/hosts). And
> > >>>> that means 3-4 virtiofsd instances running for each Kata container.
> > >>>>
> > >>>> One option seems to be to bind mount all exports into one directory and
> > >>>> export that directory using a single virtiofsd. I am aware of at least one
> > >>>> problem with that configuration: the possibility of inode number collisions
> > >>>> if the bind mounts come from different devices. Not sure how many
> > >>>> applications care, though. Sergio is looking into solving this issue. It
> > >>>> might take a while.
> > >>> I thought the bind mount setup was the normal setup seen under both Kata
> > >>> and k8s?
> > >> Kata Containers works as follows:
> > >>
> > >> kata-runtime manages a bind mount directory for each sandbox VM (k8s
> > >> pod) in /run/kata-containers/shared/sandboxes/$VM_ID.
> > >>
> > >> That directory contains the bind-mounted rootfs as well as resolv.conf
> > >> and other per-container files.
> > >>
> > >> When volumes (podman run --volume) are present they are also
> > >> bind-mounted alongside the rootfs.
> > >>
> > >> So kata-runtime ends up with something like this:
> > >>
> > >>   /run/kata-containers/shared/sandboxes/
> > >>   ... 61c192ae0e7154b6c8ffce6b13c4c5108d6dfe419a508f99ed381d9310268385/
> > >>       ... 61c192ae0e7154b6c8ffce6b13c4c5108d6dfe419a508f99ed381d9310268385/
> > >>           ... rootfs/
> > >>       ... 61c192ae0e7154b6c8ffce6b13c4c5108d6dfe419a508f99ed381d9310268385-04b134d40c6255cf-hostname
> > >>       ... 61c192ae0e7154b6c8ffce6b13c4c5108d6dfe419a508f99ed381d9310268385-62cff51b641310e5-resolv.conf
> > >>       ... 61c192ae0e7154b6c8ffce6b13c4c5108d6dfe419a508f99ed381d9310268385-b8dedcdf0c623c40-hosts
> > >>       ... 61c192ae0e7154b6c8ffce6b13c4c5108d6dfe419a508f99ed381d9310268385-d181eeeb4171c3c5-myvolume/
> > >>
> > >> Only one virtio-fs device is used per sandbox VM.
> > >>
> > >> Stefan
> > >>
> > >> _______________________________________________
> > >> Virtio-fs mailing list
> > >> Virtio-fs at redhat.com
> > >> https://www.redhat.com/mailman/listinfo/virtio-fs
> > >
> > > Also, what happens if some of the volumes are mounted read-only?
> > > What kind of error does the container process get when it attempts to
> > > write to the volume?
> > >
> > >
> > 
> > We also need to think about the volume being mounted as noexec, nodev,
> > or nosuid. Does the kernel inside the container handle this correctly?
> 
> I'd need to check, but I *think* you'll get the error propagated
> directly from the errno that the daemon sees; i.e. you'll probably
> get the ro-fs error when trying to write to a file on a mount that's
> ro.
> 
> I'm expecting it to behave like FUSE, since it's mostly the transport
> level that's changed.

There are two levels here: 1) kata-agent sets up per-container bind
mounts and 2) virtiofsd performs file system operations on behalf of the
guest.

kata-agent can apply the 'ro' mount option to a per-container bind
mount, so even if the kataShared virtio-fs mount as a whole is
read/write, the container will have read-only access.

What I hope happens (but I haven't checked) is that kata-agent sets
these mount options on the per-container bind mounts.
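The guest-visible effect would be that the per-container mount shows 'ro'
in its option list in /proc/self/mounts. A minimal sketch (plain Python,
not Kata code; the sample mount entry below is invented for illustration)
of looking up a mount point's options by parsing that file's format:

```python
# Sketch: field 4 of each /proc/self/mounts line is the option list.
# The sample entry below is made up for illustration.

def mount_options(mounts_text, mount_point):
    """Return the set of mount options for mount_point, or None if absent."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            return set(fields[3].split(","))
    return None

sample = "kataShared /run/kata-containers/shared virtiofs ro,relatime 0 0"
opts = mount_options(sample, "/run/kata-containers/shared")
print(opts)  # option set contains 'ro' when the mount is read-only
```

In the real guest one would pass the contents of /proc/self/mounts as
`mounts_text` instead of the sample string.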

If a file system operation does make its way through to the host, then
virtiofsd should fail because the bind mount on the host should also be
'ro'.
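Assuming the host errno really is passed through unchanged (my reading of
the FUSE comparison above, not something I have verified for virtiofs),
the container process would see the standard Linux error codes. A small
illustration of the values involved:

```python
import errno
import os

# Errors a container process would plausibly see if the host errno is
# passed through unchanged (assumption based on the FUSE comparison):
#   write(2) on an 'ro' mount     -> EROFS  (read-only file system)
#   execve(2) on a 'noexec' mount -> EACCES (permission denied)

for err in (errno.EROFS, errno.EACCES):
    print(errno.errorcode[err], "-", os.strerror(err))
```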

But does anyone want to check? :-)
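For anyone who does want to check, a rough way to reproduce the host side
in isolation is a read-only bind mount plus a write probe. This is a
hypothetical standalone script, not Kata code; the mount(2) calls need
root:

```python
import ctypes
import errno
import os
import tempfile

# Flag values from <sys/mount.h> on Linux.
MS_RDONLY, MS_REMOUNT, MS_BIND = 1, 32, 4096

def write_probe(directory):
    """Try to create a file under directory; return the errno name on failure."""
    path = os.path.join(directory, "probe")
    try:
        with open(path, "w") as f:
            f.write("x")
        os.unlink(path)
        return None
    except OSError as e:
        return errno.errorcode.get(e.errno)

if __name__ == "__main__" and os.geteuid() == 0:
    libc = ctypes.CDLL("libc.so.6", use_errno=True)
    src, dst = tempfile.mkdtemp(), tempfile.mkdtemp()
    # A read-only bind mount takes two steps: bind, then remount read-only.
    libc.mount(src.encode(), dst.encode(), None, MS_BIND, None)
    libc.mount(b"none", dst.encode(), None, MS_BIND | MS_REMOUNT | MS_RDONLY, None)
    print(write_probe(dst))  # EROFS expected if the bind mount succeeded
    libc.umount(dst.encode())
```

On an ordinary writable directory `write_probe` returns None, which
matches the two-level picture above: the 'ro' bind mount rejects the
write before it ever reaches virtiofsd.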

Stefan