[Virtio-fs] [PATCH 3/4] virtiofsd: use file-backed memory region for virtiofsd's cache area
Dr. David Alan Gilbert
dgilbert at redhat.com
Wed Apr 17 14:51:21 UTC 2019
* Liu Bo (bo.liu at linux.alibaba.com) wrote:
> From: Xiaoguang Wang <xiaoguang.wang at linux.alibaba.com>
>
> When running xfstests test case generic/413, we found the following issue:
> 1. create a file on one virtiofsd mount point with dax enabled;
> 2. mmap this file, getting virtual address A;
> 3. write(fd, A, len), where fd refers to another file on another
>    virtiofsd mount point without dax enabled; note that this write(2)
>    uses direct I/O;
> 4. the direct I/O hangs forever, because virtiofsd has crashed.
> Here is the stack:
> [ 247.166276] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 247.167171] t_mmap_dio D 0 2335 2102 0x00000000
> [ 247.168006] Call Trace:
> [ 247.169067] ? __schedule+0x3d0/0x830
> [ 247.170219] schedule+0x32/0x80
> [ 247.171328] schedule_timeout+0x1e2/0x350
> [ 247.172416] ? fuse_direct_io+0x2e5/0x6b0 [fuse]
> [ 247.173516] wait_for_completion+0x123/0x190
> [ 247.174593] ? wake_up_q+0x70/0x70
> [ 247.175640] fuse_direct_IO+0x265/0x310 [fuse]
> [ 247.176724] generic_file_read_iter+0xaa/0xd20
> [ 247.177824] fuse_file_read_iter+0x81/0x130 [fuse]
> [ 247.178938] ? fuse_simple_request+0x104/0x1b0 [fuse]
> [ 247.180041] ? fuse_fsync_common+0xad/0x240 [fuse]
> [ 247.181136] __vfs_read+0x108/0x190
> [ 247.181930] vfs_read+0x91/0x130
> [ 247.182671] ksys_read+0x52/0xc0
> [ 247.183454] do_syscall_64+0x55/0x170
> [ 247.184200] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
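> In code, the failing sequence looks roughly like this (a minimal
> sketch in the spirit of xfstests' t_mmap_dio; the mount paths are
> made up and error checking is omitted):
>
> #define _GNU_SOURCE             /* for O_DIRECT */
> #include <fcntl.h>
> #include <sys/mman.h>
> #include <unistd.h>
>
> int main(void)
> {
>     size_t len = 4096;          /* page-sized, satisfies O_DIRECT alignment */
>     /* file on a virtiofs mount with dax enabled */
>     int fd_dax = open("/mnt/dax/f1", O_RDWR | O_CREAT, 0644);
>     /* file on another virtiofs mount without dax */
>     int fd_dio = open("/mnt/nodax/f2", O_RDWR | O_CREAT | O_DIRECT, 0644);
>     void *a;
>
>     ftruncate(fd_dax, len);
>     /* A: mapping backed by the device's DAX cache window */
>     a = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd_dax, 0);
>
>     /*
>      * Direct I/O from A makes virtiofsd translate a guest physical
>      * address that lies inside the cache BAR; this is where the
>      * daemon crashes and the write hangs.
>      */
>     write(fd_dio, a, len);
>
>     return 0;
> }
>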
> virtiofsd crashed because vu_gpa_to_va() cannot translate this guest
> physical address. For a memory-mapped area in dax mode, the pages of
> the mapping point into virtiofsd's cache area, i.e. the virtio pci
> device's cache BAR. QEMU currently implements this cache BAR as an
> anonymous memory region and does not pass the BAR's address
> information to the vhost-user backend, so vu_gpa_to_va() fails.
>
> To fix this issue, back the vhost cache area with a file-backed
> memory region instead.
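>
> For reference, the translation that fails is essentially this region
> walk (a sketch, simplified from libvhost-user's vu_gpa_to_va()):
>
> void *vu_gpa_to_va(VuDev *dev, uint64_t *plen, uint64_t guest_addr)
> {
>     unsigned int i;
>
>     /*
>      * Find the memory region that contains guest_addr; the real
>      * code also clamps *plen to the end of the region.
>      */
>     for (i = 0; i < dev->nregions; i++) {
>         VuDevRegion *r = &dev->regions[i];
>
>         if (guest_addr >= r->gpa && guest_addr < r->gpa + r->size) {
>             return (void *)(uintptr_t)(guest_addr - r->gpa +
>                                        r->mmap_addr + r->mmap_offset);
>         }
>     }
>
>     return NULL;    /* cache BAR addresses never match any region */
> }
>
> Since the anonymous cache BAR is never part of the memory table QEMU
> sends to the backend, the walk falls off the end and returns NULL.
> A file-backed region gives QEMU an fd it can share with the backend,
> so the region can appear in that table.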
Thanks,
I know there was another case, which Stefan and Vivek hit, of the
daemon trying to access the buffer; that one was fixed by persuading
the kernel not to do it. Stefan/Vivek: what do you think?
Exposing the area back to the daemon worries me a little: the guest
can write the BAR and change the mapping, and I doubt anything would
notice that (but I also doubt it happens much).
Dave
> Reviewed-by: Liu Bo <bo.liu at linux.alibaba.com>
> Signed-off-by: Xiaoguang Wang <xiaoguang.wang at linux.alibaba.com>
> ---
> hw/virtio/vhost-user-fs.c | 59 +++++++++++++++++++++++++++++++++++++++--------
> 1 file changed, 49 insertions(+), 10 deletions(-)
>
> diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
> index a5110a9..fc2ed38 100644
> --- a/hw/virtio/vhost-user-fs.c
> +++ b/hw/virtio/vhost-user-fs.c
> @@ -300,6 +300,35 @@ static bool vuf_guest_notifier_pending(VirtIODevice *vdev, int idx)
>      return vhost_virtqueue_pending(&fs->vhost_dev, idx);
>  }
>
> +static int vuf_get_tmpfile(size_t size)
> +{
> +    const char *tmpdir = g_get_tmp_dir();
> +    gchar *fname;
> +    int fd;
> +
> +    fname = g_strdup_printf("%s/virtio-fs-cache-XXXXXX", tmpdir);
> +    fd = mkstemp(fname);
> +    if (fd < 0) {
> +        fprintf(stderr, "%s: mkstemp(%s) failed\n", __func__, fname);
> +        g_free(fname);
> +        return -1;
> +    }
> +
> +    /*
> +     * Note: unlinking now means the temp file is deleted as soon as
> +     * this file descriptor is closed or the process exits.
> +     */
> +    unlink(fname);
> +    if (ftruncate(fd, size) == -1) {
> +        fprintf(stderr, "%s: ftruncate(%s) failed\n", __func__, fname);
> +        close(fd);
> +        fd = -1;
> +    }
> +
> +    g_free(fname);
> +    return fd;
> +}
> +
>  static void vuf_device_realize(DeviceState *dev, Error **errp)
>  {
>      VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> @@ -349,16 +378,26 @@ static void vuf_device_realize(DeviceState *dev, Error **errp)
>                     "no smaller than the page size");
>          return;
>      }
> -    /* We need a region with some host memory, 'ram' is the easiest */
> -    memory_region_init_ram_nomigrate(&fs->cache, OBJECT(vdev),
> -                                     "virtio-fs-cache",
> -                                     fs->conf.cache_size, NULL);
> -    /*
> -     * But we don't actually want anyone reading/writing the raw
> -     * area with no cache data.
> -     */
> -    mprotect(memory_region_get_ram_ptr(&fs->cache), fs->conf.cache_size,
> -             PROT_NONE);
> +
> +    if (fs->conf.cache_size) {
> +        int fd;
> +
> +        fd = vuf_get_tmpfile(fs->conf.cache_size);
> +        if (fd < 0) {
> +            error_setg(errp, "failed to initialize virtio-fs-cache");
> +            return;
> +        }
> +        memory_region_init_ram_from_fd(&fs->cache, OBJECT(vdev),
> +                                       "virtio-fs-cache",
> +                                       fs->conf.cache_size, true, fd, NULL);
> +
> +        /*
> +         * But we don't actually want anyone reading/writing the raw
> +         * area with no cache data.
> +         */
> +        mprotect(memory_region_get_ram_ptr(&fs->cache),
> +                 fs->conf.cache_size, PROT_NONE);
> +    }
>
>      if (!vhost_user_init(&fs->vhost_user, &fs->conf.chardev, errp)) {
>          return;
> --
> 1.8.3.1
>
--
Dr. David Alan Gilbert / dgilbert at redhat.com / Manchester, UK