[libvirt] [PATCH v2 01/15] Add event and state details for post-copy

Jiri Denemark jdenemar at redhat.com
Fri Jan 22 15:28:55 UTC 2016


On Fri, Jan 22, 2016 at 15:23:43 +0000, Daniel P. Berrange wrote:
> On Fri, Jan 22, 2016 at 04:17:42PM +0100, Jiri Denemark wrote:
> > On Fri, Jan 22, 2016 at 15:07:04 +0000, Daniel P. Berrange wrote:
> > > On Thu, Jan 21, 2016 at 11:20:46AM +0100, Jiri Denemark wrote:
> > > > VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY and VIR_DOMAIN_PAUSED_POSTCOPY are
> > > > used on the source host once migration enters post-copy mode (which
> > > > means the domain gets paused on the source). After the destination host
> > > > takes over the execution of the domain, its virtual CPUs are resumed and
> > > > the domain enters VIR_DOMAIN_RUNNING_POSTCOPY state and
> > > > VIR_DOMAIN_EVENT_RESUMED_POSTCOPY event is emitted.
> > > > 
> > > > In case migration fails during post-copy mode and none of the hosts have
> > > > complete state of the domain, both domains will remain paused with
> > > > VIR_DOMAIN_PAUSED_POSTCOPY_FAILED reason and an upper layer may decide
> > > > what to do.
> > > > 
> > > > Signed-off-by: Jiri Denemark <jdenemar at redhat.com>
> > > 
> > > > @@ -2380,6 +2383,8 @@ typedef enum {
> > > >      VIR_DOMAIN_EVENT_SUSPENDED_RESTORED = 4,  /* Restored from paused state file */
> > > >      VIR_DOMAIN_EVENT_SUSPENDED_FROM_SNAPSHOT = 5, /* Restored from paused snapshot */
> > > >      VIR_DOMAIN_EVENT_SUSPENDED_API_ERROR = 6, /* suspended after failure during libvirt API call */
> > > > +    VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY = 7, /* suspended for post-copy migration */
> > > > +    VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY_FAILED = 8, /* suspended after failed post-copy */
> > > 
> > > Presumably the POSTCOPY_FAILED event can only be emitted
> > > on the target, since the source will already be suspended
> > > when we see a failure, and it doesn't make sense to issue
> > > a suspended event when we're already suspended.
> > 
> > But would it cause any harm? I figured it might be better to emit the
> > event and set the state to POSTCOPY_FAILED even on the source so that
> > apps/users don't have to guess whether POSTCOPY means it's still running
> > or if it already failed.
> 
> The lifecycle events are supposed to be implementing a state machine,
> and we're not changing state in this case. I think applications that
> are currently using libvirt would reasonably consider it an error if
> libvirt issues an event for a state it is already in, and I could see
> it causing them to mistakenly run some logic twice if they get two
> SUSPEND events for the same domain in a row.

We already emit some events several times in a row, but I agree it
doesn't make sense to add more cases like that. It would actually be a
good idea to fix the existing double events (in another patch series in
the future).
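
To make the state-machine argument above concrete, here is a minimal
sketch of one possible reading of it: a lifecycle event is emitted only
when the state itself changes, while a change of reason alone (for
example POSTCOPY to POSTCOPY_FAILED on an already paused source) just
updates the stored reason. All names below (demoDomain, demoState,
demoReason, demoDomainSetState) are hypothetical and are not libvirt
APIs; whether the reason would actually be updated on the source is an
assumption, not something settled in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified lifecycle states; not libvirt's real enums. */
typedef enum {
    DEMO_STATE_RUNNING,
    DEMO_STATE_PAUSED,
} demoState;

/* Hypothetical reasons attached to a state. */
typedef enum {
    DEMO_REASON_NONE,
    DEMO_REASON_POSTCOPY,
    DEMO_REASON_POSTCOPY_FAILED,
} demoReason;

typedef struct {
    demoState state;
    demoReason reason;
    unsigned int eventsEmitted; /* number of lifecycle events sent so far */
} demoDomain;

/*
 * Record a new state and reason. Returns true if a lifecycle event was
 * emitted, which in this sketch happens only when the state changes.
 * A reason-only change is stored silently, so a listener never sees two
 * SUSPENDED-style events in a row for the same domain.
 */
static bool
demoDomainSetState(demoDomain *dom, demoState state, demoReason reason)
{
    bool emit;

    assert(dom != NULL);

    emit = dom->state != state;
    dom->state = state;
    dom->reason = reason;

    if (emit)
        dom->eventsEmitted++;

    return emit;
}
```

With this model, pausing a running domain for post-copy emits one event,
and a later failure on the same (already paused) domain changes only the
reason that an application can query afterwards.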

Jirka



