[Virtio-fs] [External] Re: vhost-user reconnection and crash recovery

Jiachen Zhang zhangjiachen.jaycee at bytedance.com
Thu May 13 08:51:09 UTC 2021


On Thu, May 13, 2021 at 4:26 PM Dr. David Alan Gilbert <dgilbert at redhat.com>
wrote:

> * Jiachen Zhang (zhangjiachen.jaycee at bytedance.com) wrote:
> > Hi Stefan and Sebastien,
> >
> > I think I should give some background context from my perspective.
> >
> > For virtiofsd crash reconnection (recovery) to QEMU, as Stefan said, we
> > discussed a possible implementation on the bi-weekly virtio-fs call. I
> > also sent an RFC patch to the virtio-fs mailing list
> > (https://patchwork.kernel.org/project/qemu-devel/cover/20201215162119.27360-1-zhangjiachen.jaycee@bytedance.com/),
> > and that thread includes some discussion of the direction for further
> > revisions.
> >
> > We also need to support virtiofsd crash recovery when it is used with
> > cloud-hypervisor (https://github.com/cloud-hypervisor/cloud-hypervisor).
> > However, the virtiofsd crash reconnection RFC patch relies on QEMU's
> > vhost-user socket reconnection feature and QEMU's vhost-user inflight
> > I/O tracking feature, neither of which is supported by cloud-hypervisor.
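A minimal sketch of what inflight I/O tracking buys during crash recovery, under loud assumptions: the one-byte-per-descriptor layout, the `InflightRegion` class and `QUEUE_SIZE` are hypothetical and are not the region layout that the vhost-user protocol negotiates (via VHOST_USER_GET_INFLIGHT_FD / VHOST_USER_SET_INFLIGHT_FD). The point is only that the record of submitted-but-not-completed requests lives in a shared mapping that outlives the backend process, so a restarted backend can find out what to resubmit.

```python
import mmap
import os

QUEUE_SIZE = 8  # hypothetical number of descriptors in one virtqueue


class InflightRegion:
    """Hypothetical layout: one byte per descriptor head, 1 while in flight."""

    def __init__(self, path, queue_size=QUEUE_SIZE):
        # Create or reopen the backing file; existing contents are preserved,
        # which is what lets a restarted backend see the old entries.
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            if os.fstat(fd).st_size < queue_size:
                os.ftruncate(fd, queue_size)
            self.buf = mmap.mmap(fd, queue_size)  # shared mapping by default
        finally:
            os.close(fd)
        self.queue_size = queue_size

    def submit(self, head):
        """Mark a descriptor as taken from the ring but not yet completed."""
        self.buf[head] = 1

    def complete(self, head):
        """Clear the mark once the request has been returned to the guest."""
        self.buf[head] = 0

    def pending(self):
        """Descriptors a restarted backend would have to resubmit."""
        return [i for i in range(self.queue_size) if self.buf[i]]
```

A new `InflightRegion` opened on the same path after a simulated crash reports exactly the descriptors that were submitted but never completed.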
> >
> > So I also opened an initial pull request for cloud-hypervisor vhost-user
> > socket reconnection
> > (https://github.com/cloud-hypervisor/cloud-hypervisor/pull/2387), with
> > Sebastien as reviewer. Building on vhost-user socket reconnection, we
> > also want to develop a vhost-user inflight I/O tracking feature for
> > cloud-hypervisor, and finally to support virtiofsd crash reconnection.
> >
> > I am sorry for the delay in revising the two patch sets. I hope I can
> > free up some time over the next two months to make further progress.
>
> I'm curious what your use case is for virtiofsd crash
> recovery/reconnection - is there some reason you expect the daemon to
> crash or need to be restarted more than the whole VM?
>
> In the case of vhost-user networking with dpdk I can see the case where
> there is a central networking switch process shared between many VMs; so
> wanting to restart that without restarting all the VMs makes sense to
> me; where each VM has its own virtiofsd I don't understand the use as
> much.
>
>
Hi Dave,

Yes, we want to restart virtiofsd without restarting the whole VM. One
reason is to avoid the I/O hang caused by a virtiofs daemon crash. Another
important reason is to support virtiofsd live upgrade, built on virtiofsd
reconnection, so that bug and security fixes can be deployed.
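
To illustrate the frontend side of restarting the daemon without restarting the VM, here is a minimal sketch of a retrying connect to a UNIX domain socket. It is not QEMU's or cloud-hypervisor's reconnection code: `connect_with_retry`, its parameters and the fixed delay are assumptions for illustration, and a real VMM would also have to replay vhost-user setup after the socket is back.

```python
import socket
import time


def connect_with_retry(path, attempts=5, delay=0.1):
    """Connect to a UNIX stream socket, retrying while the backend is down."""
    last_error = None
    for _ in range(attempts):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.connect(path)
            return sock
        except (FileNotFoundError, ConnectionRefusedError) as err:
            # Socket file missing or nobody listening yet: the daemon is
            # still restarting, so wait and try again.
            sock.close()
            last_error = err
            time.sleep(delay)
    raise ConnectionError(f"backend at {path} did not come back") from last_error
```

While the retry loop runs, guest requests simply stay queued, which is why reconnection alone avoids a permanent I/O hang only if the restarted daemon can also recover its state.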

All the best,
Jiachen



> Dave
>
> > All the best,
> > Jiachen
> >
> > On Tue, May 11, 2021 at 11:02 PM Boeuf, Sebastien <sebastien.boeuf at intel.com> wrote:
> >
> > > Hi Stefan,
> > >
> > > Thanks for the explanation.
> > >
> > > So reconnection for vhost-user is not a well-defined behavior,
> > > and QEMU is doing its best to retry when possible, depending
> > > on each device.
> > >
> > > The guest does not know about it, so it's never notified that
> > > the device needs to be reset.
> > >
> > > But what about the vhost-user backend initialization? Does
> > > QEMU go through initializing the memory table, vrings, etc. again,
> > > since it can't assume anything about the backend?
> > >
> > > Thanks,
> > > Sebastien
> > >
> > > ------------------------------
> > > *From:* Stefan Hajnoczi
> > > *Sent:* Tuesday, May 11, 2021 2:45 PM
> > > *To:* Boeuf, Sebastien
> > > *Cc:* virtio-fs at redhat.com; qemu-devel at nongnu.org
> > > *Subject:* vhost-user reconnection and crash recovery
> > >
> > > Hi Sebastien,
> > > On #virtio-fs IRC you asked:
> > >
> > >  I have a vhost-user question regarding disconnection/reconnection. How
> > >  should this be handled? Let's say the vhost-user backend disconnects,
> > >  and reconnects later on, does QEMU reset the virtio device by
> > >  notifying the guest? Or does it simply reconnect to the backend
> > >  without letting the guest know about what happened?
> > >
> > > The vhost-user protocol does not have a generic reconnection solution.
> > > Reconnection is handled on a case-by-case basis because device-specific
> > > and implementation-specific state is involved.
> > >
> > > The vhost-user-fs-pci device in QEMU has not been tested with
> > > reconnection as far as I know.
> > >
> > > The ideal reconnection behavior is to resume the device from its
> > > previous state without disrupting the guest. Device state must survive
> > > reconnection in order for this to work. Neither QEMU virtiofsd nor
> > > virtiofsd-rs implements this today.
> > >
> > > virtiofs has a lot of state, making it particularly difficult to
> > > support either DEVICE_NEEDS_RESET or transparent vhost-user
> > > reconnection. We have discussed virtiofs crash recovery on the
> > > bi-weekly virtiofs call
> > > (https://etherpad.opendev.org/p/virtiofs-external-meeting). If you want
> > > to work on this then joining the call would be a good starting point to
> > > coordinate with others.
> > >
> > > One approach for transparent crash recovery is for virtiofsd to keep
> > > its state in tmpfs (e.g. inode/fd mappings) and open fds shared with a
> > > clone(2) process via CLONE_FILES. This way the virtiofsd process can
> > > terminate but its state persists in memory thanks to its clone process.
> > > The clone can then be used to launch the new virtiofsd process from the
> > > old state. This would allow the device to resume transparently with
> > > QEMU only reconnecting the vhost-user UNIX domain socket. This is an
> > > idea that we discussed in the bi-weekly virtiofs call.
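A minimal sketch of the tmpfs half of this idea only, assuming the state is a simple inode-to-path table serialized as JSON. The names `save_state`, `load_state` and `inodes.json` are hypothetical and not taken from virtiofsd, and `state_dir` is assumed to sit on a tmpfs mount such as /dev/shm. Keeping the open fds alive through a clone(2) process with CLONE_FILES is not shown here.

```python
import json
import os
import tempfile


def save_state(state_dir, inode_map):
    """Atomically write the inode -> path table into state_dir."""
    # Write to a temporary file first so a crash mid-write never leaves a
    # truncated table behind; os.replace swaps it in atomically.
    fd, tmp_path = tempfile.mkstemp(dir=state_dir)
    with os.fdopen(fd, "w") as f:
        json.dump(inode_map, f)
    os.replace(tmp_path, os.path.join(state_dir, "inodes.json"))


def load_state(state_dir):
    """Return the saved table, or an empty one on first start."""
    try:
        with open(os.path.join(state_dir, "inodes.json")) as f:
            # JSON object keys are strings, so convert them back to ints.
            return {int(k): v for k, v in json.load(f).items()}
    except FileNotFoundError:
        return {}
```

A restarted daemon would call `load_state` before accepting the reconnected vhost-user socket, so that inode numbers the guest already holds still resolve to the same files.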
> > >
> > > You mentioned device reset. VIRTIO 1.1 has the Device Status Field
> > > DEVICE_NEEDS_RESET flag that the device can use to tell the driver that
> > > a reset is necessary. This feature is present in the specification but
> > > not implemented in the Linux guest drivers. Again the reason is that
> > > handling it requires driver-specific logic for restoring state after
> > > reset; otherwise the device reset would be visible to userspace.
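For reference, a small sketch of how the flag sits in the Device Status field. The bit values are those defined by the VIRTIO specification; the helper functions `needs_reset` and `status_names` are hypothetical and only illustrate the check a driver would have to make before running its own state-restoring logic.

```python
# Device Status field bits as defined by the VIRTIO specification.
STATUS_BITS = {
    "ACKNOWLEDGE": 1,
    "DRIVER": 2,
    "DRIVER_OK": 4,
    "FEATURES_OK": 8,
    "DEVICE_NEEDS_RESET": 64,
    "FAILED": 128,
}


def needs_reset(status):
    """True if the device has asked the driver to reset it."""
    return bool(status & STATUS_BITS["DEVICE_NEEDS_RESET"])


def status_names(status):
    """Names of all bits set in a status value (hypothetical debugging aid)."""
    return [name for name, bit in STATUS_BITS.items() if status & bit]
```

Detecting the bit is the easy part; the driver-specific work described above is what has to happen after the reset so that userspace does not notice it.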
> > >
> > > Stefan
> > >
> > > _______________________________________________
> > > Virtio-fs mailing list
> > > Virtio-fs at redhat.com
> > > https://listman.redhat.com/mailman/listinfo/virtio-fs
> > >
>
>
> --
> Dr. David Alan Gilbert / dgilbert at redhat.com / Manchester, UK
>
>