[Virtio-fs] [PATCH] vhost-user-fs: add capability to allow migration

Stefan Hajnoczi stefanha at gmail.com
Tue Jan 24 12:48:12 UTC 2023


On Tue, Jan 24, 2023, 04:50 Dr. David Alan Gilbert <dgilbert at redhat.com> wrote:

> * Stefan Hajnoczi (stefanha at gmail.com) wrote:
> > On Mon, 23 Jan 2023 at 14:54, Stefan Hajnoczi <stefanha at redhat.com> wrote:
> > >
> > > On Mon, Jan 23, 2023 at 06:27:23PM +0000, Dr. David Alan Gilbert wrote:
> > > > * Michael S. Tsirkin (mst at redhat.com) wrote:
> > > > > On Sun, Jan 22, 2023 at 06:09:40PM +0200, Anton Kuchin wrote:
> > > > > >
> > > > > > On 22/01/2023 16:46, Michael S. Tsirkin wrote:
> > > > > > > On Sun, Jan 22, 2023 at 02:36:04PM +0200, Anton Kuchin wrote:
> > > > > > > > > > This flag should be set when qemu doesn't need to worry about any
> > > > > > > > > > external state stored in vhost-user daemons during migration:
> > > > > > > > > > don't fail migration, just pack generic virtio device states to the
> > > > > > > > > > migration stream, and the orchestrator guarantees that the rest of the
> > > > > > > > > > state will be present at the destination to restore full context and
> > > > > > > > > > continue running.
> > > > > > > > > Sorry, I still do not get it. So fundamentally, why do we need this property?
> > > > > > > > > vhost-user-fs is not created by default that we'd then need opt-in to
> > > > > > > > > the special "migratable" case.
> > > > > > > > > That's why I said it might make some sense as a device property as qemu
> > > > > > > > > tracks whether device is unplugged for us.
> > > > > > > > >
> > > > > > > > > But as written, if you are going to teach the orchestrator about
> > > > > > > > > vhost-user-fs and its special needs, just teach it when to migrate and
> > > > > > > > > where not to migrate.
> > > > > > > > >
> > > > > > > > > Either we describe the special situation to qemu and let qemu
> > > > > > > > > make an intelligent decision whether to allow migration,
> > > > > > > > > or we trust the orchestrator. And if it's the latter, then 'migrate'
> > > > > > > > > already says orchestrator decided to migrate.
> > > > > > > > The problem I'm trying to solve is that most of vhost-user devices
> > > > > > > > now block migration in qemu. And this makes sense since qemu can't
> > > > > > > > extract and transfer backend daemon state. But this prevents us from
> > > > > > > > updating qemu executable via local migration. So this flag is
> > > > > > > > intended more as a safety check that says "I know what I'm doing".
> > > > > > > >
> > > > > > > > I agree that it is not really necessary if we trust the orchestrator
> > > > > > > > to request migration only when the migration can be performed in a
> > > > > > > > safe way. But changing the current behavior of vhost-user-fs from
> > > > > > > > "always blocks migration" to "migrates partial state whenever
> > > > > > > > orchestrator requests it" seems a little dangerous and can be
> > > > > > > > misinterpreted as full support for migration in all cases.
> > > > > > > It's not really different from block is it? orchestrator has to arrange
> > > > > > > for backend migration. I think we just assumed there's no use-case where
> > > > > > > this is practical for vhost-user-fs so we blocked it.
> > > > > > > But in any case it's orchestrator's responsibility.
> > > > > >
> > > > > > Yes, you are right. So do you think we should just drop the blocker
> > > > > > without adding a new flag?
> > > > >
> > > > > I'd be inclined to. I am curious what do dgilbert and stefanha think though.
> > > >
> > > > Yes I think that's probably OK, as long as we use the flag for knowing
> > > > how to handle the discard bitmap as a proxy for the daemon knowing how
> > > > to handle *some* migrations; knowing which migrations is then the job
> > > > for the orchestrator to be careful of.
> > >
> > > I think the feature bit is not a good way to detect live migration
> > > support. vhost-user backends typically use libvhost-user, rust-vmm's
> > > vhost-user-backend crate, etc., where this feature can be implemented for
> > > free. If the feature bit is advertised we don't know if the device
> > > implementation (net, blk, fs, etc.) is aware of migration at all.
> >
> > I checked how bad the situation is. libvhost-user currently enables
> > LOG_ALL by default. :(
> >
> > So I don't think the front-end can use LOG_ALL alone to determine
> > whether or not migration is supported by the back-end.
> >
> > There are several existing back-ends based on libvhost-user that have
> > no concept of reconnection or migration but report the LOG_ALL feature
> > bit.
>
> Ouch, yes that's messy.
>
> Going back to the original question; I don't think a command line flag
> will work though, because even for a given VM there's the possibility
> of some (local) migrations working but other (remote) migrations not
> working; so you don't know at the point you start the VM whether
> your migrations are going to work.
>

The user or management tool should know which types of migration a
vhost-user-fs backend supports. That can be passed in as a per-device
parameter.

A migration parameter could then distinguish between same-host and
remote-host migration. QEMU already distinguishes between pre-copy and
post-copy migration, so this can be thought of as yet another type of
migration.
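
To make the idea concrete, here is a rough sketch in Python of how a
per-device parameter and a per-migration parameter could combine. This is
only an illustration: `MigrationScope`, `VhostUserFsDevice`, and
`migration_blockers` are hypothetical names, not existing QEMU interfaces,
and the real parameters would still need to be designed.

```python
from dataclasses import dataclass
from enum import Flag, auto


class MigrationScope(Flag):
    """Hypothetical migration types; not an existing QEMU enum."""
    NONE = 0
    SAME_HOST = auto()
    REMOTE_HOST = auto()


@dataclass
class VhostUserFsDevice:
    """Hypothetical per-device setting supplied by the user or management tool."""
    name: str
    supported: MigrationScope = MigrationScope.NONE  # default: block all migration


def migration_blockers(devices, requested):
    """Return the names of devices whose back-end does not support `requested`.

    `requested` is the single scope chosen for this migration
    (the hypothetical migration parameter).
    """
    return [d.name for d in devices if not (d.supported & requested)]


devices = [
    VhostUserFsDevice("fs0", MigrationScope.SAME_HOST),
    VhostUserFsDevice("fs1", MigrationScope.SAME_HOST | MigrationScope.REMOTE_HOST),
]

# A same-host migration (e.g. a QEMU binary upgrade) is allowed for both devices.
print(migration_blockers(devices, MigrationScope.SAME_HOST))
# A remote-host migration is blocked by fs0 only.
print(migration_blockers(devices, MigrationScope.REMOTE_HOST))
```

With this split, the decision is made when the migration is requested rather
than when the VM is started, which addresses the concern that the same VM may
support local migrations but not remote ones.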

Stefan
