[Virtio-fs] Ways to uniquely and persistently identify nodes

Max Reitz mreitz at redhat.com
Fri Jan 17 18:13:26 UTC 2020


On 16.01.20 21:32, Vivek Goyal wrote:
> On Wed, Jan 15, 2020 at 02:00:27PM +0100, Max Reitz wrote:
>> On 15.01.20 13:01, Stefan Hajnoczi wrote:
>>> On Tue, Jan 14, 2020 at 08:24:51PM +0100, Miklos Szeredi wrote:
>>>> On Tue, Jan 14, 2020 at 6:13 PM Max Reitz <mreitz at redhat.com> wrote:
>>>>> What worries me most is how to pass that object around to all FUSE
>>>>> functions, and that they all need a new interface.
>>>>
>>>> You mean libfuse API?
>>>>
>>>>> I just had a very fuzzy (and maybe stupid) idea: Maybe we could keep an
>>>>> internal vector of currently active handles and then when variable-size
>>>>> handles are enabled, fuse_ino_t would just act as an index into that vector?
>>>>>
>>>>> I suppose we could then use the full-size handles in all messages and
>>>>> just hand out temporary indices to existing functions (just so we don’t
>>>>> have to change their interface).  Server and client have their own
>>>>> vectors, because when they communicate, only the full handles have meaning.
>>>>>
>>>>> Or we could implement the table on top of the current system by sharing
>>>>> it between the client and server.  Whenever the server creates a
>>>>> fuse_ino_t value, it then also creates a full-size handle, and returns
>>>>> both the handle and its corresponding fuse_ino_t value to the server.
>>>>> The server can use the fuse_ino_t normally most of the time, but with a
>>>>> catch: The client would be able to invalidate it.  Then the server needs
>>>>> to obtain a new fuse_ino_t value for the existing handle.
>>>>> (Invalidating and reacquiring a fuse_ino_t value would be new FUSE
>>>>> operations.)
>>>>
>>>> I think you may have accidentally switched "server" and "client" in
>>>> the above description a couple of times.
>>>>
>>>> But if I'm reading this correctly, your idea is to keep the 64bit
>>>> value on the interface (this could be just libfuse or both libfuse and
>>>> the kernel API) and add new interfaces to establish and reestablish
>>>> the mapping from (non-persistent) fuse_ino_t to (persistent) handle.
>>>
>>> I'm not sure I understand Max's idea.
>>>
>>> Miklos, I think you're saying keep fuse_ino_t semantics unchanged (not
>>> persistent, can be reused after the last reference is dropped) and add a
>>> separate struct export_operations-style FUSE API to map fuse_ino_t <->
>>> struct file_handle.
>>
>> Yes, that’d be the idea.
>>
>>> We'd need to use submounts to avoid st_ino collisions in that case.
>>
>> Yes, because that idea wouldn’t solve the st_ino problem.  (Only
>> persistent fuse_ino_t values would solve it.  This proposal is exactly
>> the opposite, we want them to be absolutely not persistent.)
>>
>> So we’d still want submounts with separate st_devs and pass through
>> st_ino from the host.
> 
> I am not sure I understand the whole discussion. Here is my
> understanding. Please correct me if I got it wrong.
> 
> So we seem to have two main problems.
> 
> - st_ino collisions as seen by user space when the directory being
>   exported contains multiple submounts on the host. The solution for
>   this seems to be supporting submounts, where st_dev is reported per
>   submount in the guest and st_ino is passed through from the host.
> 
> - Being able to have a persistent handle for an inode on the host. This
>   will allow us to support the export API on virtiofs and also allow the
>   server to trim its cache even though the client still holds a
>   reference to that inode.

It would also allow us to migrate virtiofsd, as long as the destination
can make sense of the existing handles.

Trimming the cache wasn’t really a problem AFAIU, but with the right
solution to the problem we can address that, too, yes.

As I wrote, I see three problems: (1) inode collisions, (2)
migratability, (3) passing persistent file handles to the client.
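
To make the table idea quoted further up a bit more concrete, here is a
minimal sketch of how short-lived fuse_ino_t-style IDs could sit on top of
persistent, variable-size handles.  It is only an illustration under my own
assumptions: HandleTable, acquire(), lookup() and invalidate() are
hypothetical names, not part of libfuse or virtiofsd, and the handle is
treated as an opaque byte string.

```python
class HandleTable:
    """Maps short-lived integer IDs to persistent, variable-size handles.

    Hypothetical illustration only; not the libfuse or virtiofsd API.
    """

    def __init__(self):
        self._by_id = {}       # transient ID -> handle bytes
        self._by_handle = {}   # handle bytes -> transient ID
        self._next_id = 2      # 1 is the FUSE root node ID

    def acquire(self, handle: bytes) -> int:
        """Return the transient ID for a handle, creating one if needed."""
        ino = self._by_handle.get(handle)
        if ino is None:
            ino = self._next_id
            self._next_id += 1
            self._by_id[ino] = handle
            self._by_handle[handle] = ino
        return ino

    def lookup(self, ino: int) -> bytes:
        """Resolve a transient ID; raises KeyError if it was invalidated."""
        return self._by_id[ino]

    def invalidate(self, ino: int) -> bytes:
        """Drop a transient ID (e.g. to trim a cache); the handle stays valid."""
        handle = self._by_id.pop(ino)
        del self._by_handle[handle]
        return handle


table = HandleTable()
h = b"\x01\x00example-handle"   # made-up opaque handle
first = table.acquire(h)        # the ID handed to existing functions
table.invalidate(first)         # one side drops the ID
second = table.acquire(h)       # reacquire a fresh ID for the same handle
```

In this sketch IDs are never reused, so a stale ID fails loudly instead of
silently pointing at a different node.  Whether a real implementation would
want that, or would rather recycle IDs, is an open design choice.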

> IIUC, first one is more pressing for us and second one can wait. Have
> I got it right?

It was my impression that file handles were more pressing in practice,
but I don’t know.  I’m currently trying to solve them all in one package.

> Having said that, I don't know how complex it is to support submounts
> and what are other implications.

As far as I’ve seen so far, it’s conceptually rather simple; the devil’s
just in the details.  (It’s easy to do a submount, the question is how
we can make that submount then use a relative base directory as its root.)
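
As a rough sketch of why submounts address the collision problem: if every
host filesystem gets its own guest-visible device number and st_ino is
passed through unchanged, then the (st_dev, st_ino) pair stays unique even
when two host filesystems use the same inode number.  SubmountMap and
guest_identity() are hypothetical names for this illustration, and the
device and inode numbers below are made up.

```python
class SubmountMap:
    """Assigns one guest-visible device number per host st_dev.

    Hypothetical illustration only; not how virtiofsd is implemented.
    """

    def __init__(self):
        self._guest_dev = {}   # host st_dev -> guest-visible device number
        self._next_dev = 1

    def guest_identity(self, host_dev: int, host_ino: int):
        """Return (guest st_dev, st_ino); st_ino is passed through as-is."""
        if host_dev not in self._guest_dev:
            self._guest_dev[host_dev] = self._next_dev
            self._next_dev += 1
        return (self._guest_dev[host_dev], host_ino)


submounts = SubmountMap()
# Two different host filesystems that both happen to use inode number 42.
a = submounts.guest_identity(host_dev=0x801, host_ino=42)
b = submounts.guest_identity(host_dev=0x802, host_ino=42)
# Another inode on the first host filesystem shares its guest device.
c = submounts.guest_identity(host_dev=0x801, host_ino=43)
```

Here a and b carry the same st_ino but differ in st_dev, so user space in
the guest can tell them apart.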

Max
