[Virtio-fs] [PATCH v3 1/1] vhost-user-fs: add migration type property

Tue Feb 28 21:24:16 UTC 2023

On Tue, Feb 28, 2023 at 07:59:54PM +0200, Anton Kuchin wrote:
> On 28/02/2023 16:57, Michael S. Tsirkin wrote:
> > On Tue, Feb 28, 2023 at 04:30:36PM +0200, Anton Kuchin wrote:
> > > I really don't understand what you want to check on the
> > > destination, or why.
> > Yes I understand your patch controls source. Let me try to rephrase
> > why I think it's better on destination.
> > Here's my understanding
> > - With vhost-user-fs state lives inside an external daemon.
> > A- If after load you connect to the same daemon you can get migration mostly
> >    for free.
> > B- If you connect to a different daemon then that daemon will need
> >    to pass information from original one.
> > 
> > Is this a fair summary?
> > 
> > Current solution is to set flag on the source meaning "I have an
> > orchestration tool that will make sure that either A or B is correct".
> > 
> > However both A and B can only be known when destination is known.
> > Especially as long as what we are really trying to do is just allow qemu
> > restarts, checking the flag on load will thus achieve it in a cleaner
> > way, in that the orchestration tool can reasonably keep the flag
> > clear normally and only set it if restarting qemu locally.
> > 
> > 
> > By comparison, with your approach orchestration tool will have
> > to either always set the flag (risky since then we lose the
> > extra check that we coded) or keep it clear and set before migration
> > (complex).
> > 
> > I hope I explained what and why I want to check.
> > 
> > I am far from a vhost-user-fs expert so maybe I am wrong but
> > I wanted to make sure I got the point across even if others
> > disagree.
> > 
> 
> Thank you for the explanation. Now I understand your concerns.
> 
> You are right about this mechanism being a bit risky if the orchestrator
> is not using it properly, or clunky if it is used in the safest possible
> way. That's why the first attempt at this feature used a migration
> capability to let the orchestrator choose the behavior right at the moment
> of migration. But it has its own problems.
> 
> We can't move this check only to the destination because one of the main
> goals was to prevent orchestrators that are unaware of vhost-user-fs
> specifics from accidentally migrating such VMs. We can't rely entirely on
> the destination to block this, because if the VM is migrated to a file and
> then can't be loaded by the destination there is no way to fall back and
> resume the source, so we need some kind of blocker on the source by default.

Interesting.  Why is there no way? Just load it back on source? Isn't
this how any other load failure is managed? Because for sure you
need to manage these, they will happen.

> That said, checking on the destination would need another flag, and the
> safe way of using this feature would require managing two flags instead of
> one, making it even more fragile. So I'd prefer not to make it more complex.
>
> In my opinion the best way for an orchestrator to use this property is to
> leave the default unmigratable behavior at start and, just before migration
> when the destination is known, enumerate all vhost-user-fs devices and set
> the properties according to their backends' capabilities with QMP, like you
> mentioned. This gives us a single point of decision for each device and
> avoids guessing the future at VM start.

This means that you need to remember what the values were, and then
any failure on the destination requires you to go back and set them
to the original values. With the possibility of crashes on the orchestrator
you also need to recall the temporary values in some file ...
This is huge complexity, much worse than two flags.
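
To make the bookkeeping concrete, here is a minimal sketch of what an
orchestrator would have to carry under that scheme. Everything in it is
an assumption made for illustration: the `PropertyJournal` class, the
`devices` dict standing in for per-device property state that would
really be read and written over QMP, the `do_migrate` callback, and the
placeholder values "none" and "external". None of it is taken from the
patch.

```python
import json
import os


class PropertyJournal:
    """Hypothetical helper: persists original property values on disk so
    they survive an orchestrator crash."""

    def __init__(self, path):
        self.path = path

    def save(self, originals):
        # Write to a temp file and rename so a crash never leaves a
        # half-written journal behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(originals, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)

    def load(self):
        if not os.path.exists(self.path):
            return None
        with open(self.path) as f:
            return json.load(f)

    def clear(self):
        if os.path.exists(self.path):
            os.remove(self.path)


def migrate_with_temporary_properties(devices, wanted, journal, do_migrate):
    """Set per-device values just before migration, restore on failure.

    `devices` maps a device id to its current property value; in a real
    orchestrator these reads and writes would be QMP calls (assumed).
    """
    originals = {dev: devices[dev] for dev in wanted}
    journal.save(originals)          # must hit disk before any change
    for dev, value in wanted.items():
        devices[dev] = value
    try:
        ok = bool(do_migrate())
    except Exception:
        ok = False
    if not ok:
        devices.update(originals)    # go back to the original values
    journal.clear()
    return ok


def recover_after_crash(devices, journal):
    """Run at orchestrator startup: undo a change interrupted by a crash."""
    originals = journal.load()
    if originals is not None:
        devices.update(originals)
        journal.clear()
```

Even this toy version needs an on-disk journal, an atomic write, a
restore path and a startup recovery hook, all for a single property.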

Assuming we need two, let's see whether just reloading on the source is
good enough.

> But allowing setup via the command line is valid too, because some backends
> may always be capable of external migration independent of hosts and don't
> need the QMP manipulations before migration at all.

I am much more worried that the realistic scenario is hard to manage
safely than about theoretical state-migrating backends that don't exist.

-- 
MST