[libvirt] [PATCH] Fix a deadlock in bi-directional p2p concurrent migration.

Chris Lalancette clalance at redhat.com
Tue Jul 20 14:00:22 UTC 2010


On 07/20/10 - 11:53:43AM, Daniel P. Berrange wrote:
> On Fri, Jul 16, 2010 at 09:38:10AM -0400, Chris Lalancette wrote:
> > If you try to execute two concurrent p2p migrations
> > from A->B and B->A, the two libvirtd's will deadlock
> > trying to perform the migrations.  The reason for this is
> > that in p2p migration, the libvirtd's are responsible for
> > making the RPC Prepare, Migrate, and Finish calls.  However,
> > they are currently holding the driver lock while doing so,
> > which basically guarantees deadlock in this scenario.
> > 
> > This patch fixes the situation by adding
> > qemuDomainObjEnterRemoteWithDriver and
> > qemuDomainObjExitRemoteWithDriver helper methods.  The Enter
> > takes an additional object reference, then drops both the
> > domain object lock and the driver lock.  The Exit takes
> > both the driver and domain object lock, then drops the
> > reference.  Adding calls to these Enter and Exit helpers
> > around remote calls in the various migration methods
> > seems to fix the problem for me in testing.
> > 
> > This should make the situation safe. The additional domain
> > object reference ensures that the domain object won't disappear
> > while this operation is happening.  The BeginJob that is called
> > inside of qemudDomainMigratePerform ensures that we can't execute a
> > second migrate (or shutdown, or save, etc) job while the
> > migration is active.  Finally, the additional check on the state
> > of the vm after we reacquire the locks ensures that we can't
> > be surprised by an external event (domain crash, etc).
> > 
> > Signed-off-by: Chris Lalancette <clalance at redhat.com>
> 
> ACK
> 
> Daniel

Thanks, I've pushed this now.
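
For reference, the locking pattern described in the commit message can be
sketched roughly as below.  This is a simplified illustration under stated
assumptions, not the patch itself: the sketch_* types and functions are
hypothetical stand-ins for the driver and domain objects, plain pthread
mutexes replace libvirt's own locking wrappers, and the remote call is a
fake callback that only checks that the driver lock is free.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the QEMU driver and domain objects. */
struct sketch_driver {
    pthread_mutex_t lock;
};

struct sketch_domain {
    pthread_mutex_t lock;
    int refs;      /* protected by lock */
    bool active;   /* protected by lock; false once the guest is gone */
};

/* Caller holds driver->lock and dom->lock.  Take an extra reference,
 * then drop both locks so the remote call runs unlocked. */
static void sketch_enter_remote_with_driver(struct sketch_driver *driver,
                                            struct sketch_domain *dom)
{
    dom->refs++;
    pthread_mutex_unlock(&dom->lock);
    pthread_mutex_unlock(&driver->lock);
}

/* Retake the driver lock, then the domain lock, then drop the
 * extra reference taken in the Enter helper. */
static void sketch_exit_remote_with_driver(struct sketch_driver *driver,
                                           struct sketch_domain *dom)
{
    pthread_mutex_lock(&driver->lock);
    pthread_mutex_lock(&dom->lock);
    assert(dom->refs > 0);
    dom->refs--;
}

/* Fake "remote call" for illustration: succeeds only if the driver
 * lock is free while it runs. */
static int sketch_fake_remote_call(void *opaque)
{
    struct sketch_driver *driver = opaque;

    if (pthread_mutex_trylock(&driver->lock) != 0)
        return -1;
    pthread_mutex_unlock(&driver->lock);
    return 0;
}

/* Wrap one remote step.  Called and returns with both locks held.
 * Returns 0 on success, -1 if the call failed or the domain stopped
 * running while the locks were dropped. */
static int sketch_remote_step(struct sketch_driver *driver,
                              struct sketch_domain *dom,
                              int (*remote_call)(void *),
                              void *opaque)
{
    int rc;

    sketch_enter_remote_with_driver(driver, dom);
    rc = remote_call(opaque);
    sketch_exit_remote_with_driver(driver, dom);

    /* Re-check state: an external event may have hit the domain. */
    if (rc < 0 || !dom->active)
        return -1;
    return 0;
}
```

The point of the pattern is that neither daemon holds its driver lock
while waiting on the other, so the A->B and B->A migrations can each
service the incoming RPC from the peer.  The extra reference keeps the
domain object alive across the unlocked window, and the state check
after relocking catches a domain that went away in the meantime.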

--
Chris Lalancette