[Virtio-fs] virtiofsd: doesn't grant write access to users allowed by group permission

tecufanujacu at tutanota.com tecufanujacu at tutanota.com
Fri Jun 4 18:30:30 UTC 2021


Thanks for this great explanation; everything is much clearer to me now.

Basically, this problem occurs whenever the permission configuration is even slightly more advanced, and at the moment there is no on-the-fly solution that avoids patching and recompiling the kernel.

Good to know.


Jun 2, 2021, 10:32 by chirantan at chromium.org:

> On Wed, Jun 2, 2021 at 1:01 AM <tecufanujacu at tutanota.com> wrote:
>
>>
>> Hello to everyone,
>>
>> I have been trying virtiofsd with Proxmox for a few weeks and noticed a problem: virtiofsd doesn't grant write access to users who are allowed by group permission.
>>
>> The virtiofsd binary included in Proxmox is v7.32, but I have also tested binaries compiled from source from the stable branch (v7.27) and the dev branch (v7.33), as well as the virtiofsd-rs binary. The problem is always the same.
>>
>> I opened an issue on GitLab and asked in the Proxmox forum, but got almost no interaction. It seems to me that virtiofsd isn't widely used. Someone suggested that I write to the mailing list about these technical matters, so here I am.
>>
>> To better explain the problem I'm seeing, here is the issue I opened on GitLab, where I reported a lot of useful info and logs with good formatting:
>> https://gitlab.com/qemu-project/qemu/-/issues/368
>>
>> This seems really strange to me. Am I making a mistake, or is this really a virtiofsd bug?
>>
>
> I'm pretty sure this is a virtiofsd issue because we ran into the same
> problem on crosvm.  The issue is that the server changes its uid/gid
> to the uid/gid in the fuse context before making the syscall.  This
> ensures that the file/directory appears atomically with the correct
> metadata.  However, this causes an EACCES error when the following
> conditions are met:
>
> * The parent directory has g+rw permissions with gid A
> * The process has gid B but has A in its list of supplementary groups
>
> In this case the fuse context only contains gid B, which doesn't have
> permission to modify the parent directory.
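
The two conditions above can be modeled with a simplified version of the classic owner/group/other check. This is only a sketch: the `Creds` type, the `may_create_in_dir` helper, and the numeric ids are made up for illustration, and the model ignores ACLs, capabilities, and LSMs. Creating an entry also needs the search (x) bit on the directory, so the example uses mode 0770.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Creds:
    """Hypothetical credential set used for a permission check."""
    uid: int
    gid: int
    groups: frozenset = frozenset()  # supplementary groups


def may_create_in_dir(mode: int, owner_uid: int, owner_gid: int, creds: Creds) -> bool:
    """Simplified Unix check: creating an entry needs write + search on the directory."""
    need = 0o3  # w and x bits
    if creds.uid == owner_uid:
        bits = (mode >> 6) & 0o7      # owner class
    elif creds.gid == owner_gid or owner_gid in creds.groups:
        bits = (mode >> 3) & 0o7      # group class
    else:
        bits = mode & 0o7             # other class
    return (bits & need) == need


GID_A, GID_B = 1000, 1001  # illustrative ids only

# Inside the guest: primary gid B, with A as a supplementary group.
guest_process = Creds(uid=1001, gid=GID_B, groups=frozenset({GID_A}))
# On the host: the server took uid/gid from the fuse context, so A is missing.
server_after_switch = Creds(uid=1001, gid=GID_B)

print(may_create_in_dir(0o770, 0, GID_A, guest_process))        # True
print(may_create_in_dir(0o770, 0, GID_A, server_after_switch))  # False
```

The first call reflects what the guest kernel sees; the second reflects the server after it has switched to the uid/gid from the fuse context, where group A is no longer present.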
>
> There are a couple of ways to fix this problem:
>
> The first one we tried was to split file/directory creation into 2
> stages [1].  Basically for files we first create a temporary with
> O_TMPFILE and then initialize the metadata before linking it into the
> directory tree.  The main issue with this is that we're duplicating
> the work that the kernel already does on open and turning a single syscall
> in the VM into several syscalls on the host, which adds a significant
> amount of latency.  You also have to deal with a bunch of esoteric
> corner cases for file systems that the kernel would normally just
> handle automatically [2][3].  For directories, there is no O_TMPFILE
> equivalent so we had to do a gross hack of creating a directory with a
> random name and then renaming it to the correct one once all the
> metadata was properly initialized.  In theory you could create the
> directory in a separate "work dir" first but you have to be careful if
> the original directory uses SELinux.  From what I understand, rename
> preserves the security context so to ensure the security context is
> properly inherited from the parent directory you need to create a new
> directory anyway.  Or figure out what the correct context should be
> and set it in the work dir before the rename.
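
A rough sketch of the two-stage approach, assuming Linux with /proc mounted and a file system that supports O_TMPFILE. The function names are hypothetical and this is not the crosvm code from [1]; it only illustrates the sequence of syscalls. Passing -1 as uid or gid leaves that id unchanged.

```python
import os
import secrets


def create_file_two_stage(parent_fd: int, name: str, mode: int, uid: int, gid: int) -> None:
    """Create an unnamed file, set its metadata, then link it into the directory."""
    # Stage 1: anonymous file on the same file system as the parent directory.
    fd = os.open(".", os.O_TMPFILE | os.O_WRONLY, 0o600, dir_fd=parent_fd)
    try:
        os.fchown(fd, uid, gid)
        os.fchmod(fd, mode)
        # Stage 2: give the file a name. Passing dst_dir_fd makes Python use
        # linkat(2) with AT_SYMLINK_FOLLOW, which resolves the /proc symlink.
        os.link(f"/proc/self/fd/{fd}", name, dst_dir_fd=parent_fd, follow_symlinks=True)
    finally:
        os.close(fd)


def create_dir_two_stage(parent_fd: int, name: str, mode: int, uid: int, gid: int) -> None:
    """Create a directory under a random name, set its metadata, then rename it."""
    tmp = f".tmp-{secrets.token_hex(8)}"
    os.mkdir(tmp, 0o700, dir_fd=parent_fd)
    try:
        fd = os.open(tmp, os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW, dir_fd=parent_fd)
        try:
            os.fchown(fd, uid, gid)
            os.fchmod(fd, mode)
        finally:
            os.close(fd)
        # Caveat: rename(2) silently replaces an existing empty directory. A real
        # server would want renameat2() with RENAME_NOREPLACE, which Python's os
        # module does not expose.
        os.rename(tmp, name, src_dir_fd=parent_fd, dst_dir_fd=parent_fd)
    except BaseException:
        os.rmdir(tmp, dir_fd=parent_fd)
        raise
```

Both helpers take a directory file descriptor, for example one obtained with `os.open(path, os.O_RDONLY | os.O_DIRECTORY)`. Even in this small form, one guest operation turns into four or five host syscalls, which is the latency cost mentioned above.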
>
> The second solution, which is also what we're using now, is to set the
> SECBIT_NO_SETUID_FIXUP flag.  This flag prevents the kernel from
> dropping caps when the process changes its uid/gid so the permission
> checks are skipped as long as the server has the appropriate
> capabilities (CAP_DAC_OVERRIDE, I think?).  Doing this lets us drop
> all the special handling code and just go back to making a single
> syscall and letting the kernel figure out the rest [4].  The crosvm fs
> device always runs as root in a user namespace with the proper caps so
> this works for us but obviously will not work if virtiofsd doesn't
> have the caps to begin with.  At that point the first option may be
> the only choice.
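
A minimal sketch of reading and setting the flag through prctl(2), using ctypes. The constant values are taken from the Linux uapi headers (linux/prctl.h and linux/securebits.h); the helper names are hypothetical. Setting securebits needs CAP_SETPCAP, so the second helper fails with EPERM in an unprivileged process.

```python
import ctypes
import os

PR_GET_SECUREBITS = 27
PR_SET_SECUREBITS = 28
SECBIT_NO_SETUID_FIXUP = 1 << 2

_libc = ctypes.CDLL(None, use_errno=True)
# prctl() takes unsigned long arguments; declare them so no bits are lost.
_libc.prctl.argtypes = [ctypes.c_int] + [ctypes.c_ulong] * 4
_libc.prctl.restype = ctypes.c_int


def get_securebits() -> int:
    """Return the securebits of the calling thread."""
    bits = _libc.prctl(PR_GET_SECUREBITS, 0, 0, 0, 0)
    if bits < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return bits


def enable_no_setuid_fixup() -> None:
    """Ask the kernel not to adjust capabilities on uid changes.

    Requires CAP_SETPCAP; raises PermissionError otherwise.
    """
    bits = get_securebits() | SECBIT_NO_SETUID_FIXUP
    if _libc.prctl(PR_SET_SECUREBITS, bits, 0, 0, 0) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

A server would call `enable_no_setuid_fixup()` once at startup, before it begins switching uid/gid per request. Whether CAP_DAC_OVERRIDE alone is enough afterwards is left open here, as in the text above.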
>
> I guess a third option is to change the fuse protocol so that it also
> includes the supplementary groups of the calling process.  Then the
> server can also set its supplementary groups the same way before
> making the syscall.  Merging the necessary changes into the kernel is
> left as an exercise for the reader ;-)
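
If the protocol did carry the caller's supplementary groups, the server side could look roughly like the sketch below. The `groups` argument stands in for a field that the fuse context does not have today, and the helper name is hypothetical. Note that os.setgroups() changes the groups of the whole process and needs CAP_SETGID; a multi-threaded server would need a per-thread equivalent.

```python
import os
from contextlib import contextmanager


@contextmanager
def caller_groups(groups):
    """Temporarily adopt the supplementary groups of the calling guest process.

    `groups` is a hypothetical list of gids delivered with the request.
    """
    saved = os.getgroups()
    os.setgroups(list(groups))
    try:
        yield
    finally:
        # Restore the server's own groups even if the operation failed.
        os.setgroups(saved)
```

The create call would then run inside `with caller_groups(request_groups):`, next to the existing uid/gid switch, so that the host kernel sees group A during its permission check.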
>
>
> Chirantan
>
> [1]: https://chromium-review.googlesource.com/c/chromiumos/platform/crosvm/+/2217534
> [2]: https://chromium-review.googlesource.com/c/chromiumos/platform/crosvm/+/2253493
> [3]: https://chromium-review.googlesource.com/c/chromiumos/platform/crosvm/+/2260253
> [4]: https://chromium-review.googlesource.com/c/chromiumos/platform/crosvm/+/2684067
>
