[Virtio-fs] [External] Re: [RFC PATCH 0/9] Support for Virtio-fs daemon crash reconnection

Jiachen Zhang zhangjiachen.jaycee at bytedance.com
Wed Mar 17 12:57:47 UTC 2021


On Wed, Mar 17, 2021 at 7:50 PM Christian Schoenebeck <
qemu_oss at crudebyte.com> wrote:

> On Mittwoch, 17. März 2021 11:05:32 CET Stefan Hajnoczi wrote:
> > On Fri, Dec 18, 2020 at 05:39:34PM +0800, Jiachen Zhang wrote:
> > > Thanks for the suggestions. Actually, we choose to save all state
> > > information to QEMU because a virtiofsd has the same lifecycle as its
> > > QEMU master. However, saving things to a file do avoid communication
> with
> > > QEMU, and we no longer need to increase the complexity of vhost-user
> > > protocol. The suggestion to save fds to the systemd is also very
> > > reasonable
> > > if we don't consider the lifecycle issues, we will try it.
> >
> > Hi,
> > We recently discussed crash recovery in the virtio-fs bi-weekly call and
> > I read some of this email thread because it's a topic I'm interested in.
>
> I just had a quick fly over the patches so far. Shouldn't there be some
> kind
> of constraint for an automatic reconnection feature after a crash to
> prevent
> this being exploited by ROP brute force attacks?
>
> E.g. adding some (maybe continuously increasing) delay and/or limiting the
> amount of reconnects within a certain time frame would come to my mind.
>
> Best regards,
> Christian Schoenebeck
>
>
>

Thanks, Christian. I am still trying to figure out the details of the ROP
attacks.

However, QEMU's vhost-user reconnection is based on chardev socket
reconnection. The socket reconnection can be enabled by the "--chardev
socket,...,reconnect=N" in QEMU command options, in which N means QEMU will
try to connect the disconnected socket every N seconds. We can increase N
to increase the reconnect delay. If we want to change the reconnect delay
dynamically, I think we should change the chardev socket reconnection code.
It is a more generic mechanism than vhost-user-fs and vhost-user backend.

By the way, I also considered the socket reconnection delay time in the
performance aspect. As the reconnection delay increase, if an application
in the guest is doing I/Os, it will suffer larger tail latency. And for
now, the smallest delay is 1 second, which is rather large for
high-performance virtual I/O devices today. I think maybe a more performant
and safer reconnect delay adjustment mechanism should be considered in the
future. What are your thoughts?


Jiachen
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/virtio-fs/attachments/20210317/0d397560/attachment.htm>


More information about the Virtio-fs mailing list