[libvirt] why not shutdown the src vm to avoid split-brain

zhang bo oscar.zhangbo at huawei.com
Sun Feb 15 06:36:04 UTC 2015


In doPeer2PeerMigrate3(), the "finish" step checks whether domainMigrateFinish3() returns NULL.
If the returned domain (ddomain) is NULL, it simply restarts the guest on the source.

Please consider the scenario where the guest is already running on the dest, but the dest fails to report this
to the source, so ddomain ends up NULL. If we then restart the guest on the source, two copies of the same guest
will be running, one on each side, and a SPLIT-BRAIN occurs.

It seems much better to stop both of them than to leave both running. At the very least, when we find that ddomain
is NULL, we should probably check whether the problem was caused by a keepAlive failure and, if so, kill the guest
on the source rather than restarting it.
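
To make the proposal concrete, here is a minimal sketch of the decision I have in mind. It is not libvirt code:
the enum, the function name and both parameters are made up for illustration, and it assumes the source can tell
a lost connection (e.g. a keepAlive timeout) apart from a failure that the dest reported explicitly.

```c
#include <stdbool.h>

/* All names below are hypothetical; none of them are libvirt APIs. */

/* What the source does with its guest after the "finish" step. */
typedef enum {
    SRC_ACTION_CONFIRM_SUCCESS, /* dest has the guest: tear down the source copy */
    SRC_ACTION_RESTART,         /* dest reported failure: restart on the source */
    SRC_ACTION_KILL             /* dest state unknown: do not restart the guest */
} src_action;

/*
 * have_ddomain: the finish step returned a non-NULL domain.
 * conn_lost:    the finish step failed because the connection to the dest
 *               was lost (e.g. keepAlive timeout), so the source never
 *               learned whether the guest is running on the dest.
 */
static src_action
decide_src_action(bool have_ddomain, bool conn_lost)
{
    if (have_ddomain)
        return SRC_ACTION_CONFIRM_SUCCESS;
    if (conn_lost)
        return SRC_ACTION_KILL;   /* avoid a possible split-brain */
    return SRC_ACTION_RESTART;    /* today's behaviour for every NULL ddomain */
}
```

The cost of this choice is that if the connection is lost and the dest really did fail, the guest ends up stopped
on both sides. That is the trade-off suggested above: both stopped rather than both running.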

What do you think about that?


BTW, the comment that came with commit 2593f9692df0f128b14cde811e18aa49c1cf3e06 says: "The lock manager plugins
should take care of safety in this scenario". I don't quite understand that:
1) If we migrate the guest with the flag VIR_MIGRATE_NON_SHARED_DISK, then the NBD server may take care of data
consistency. But the NBD server is already stopped before the CPUs are started on the dest, so at that moment
nothing takes care of this problem.
2) If we migrate the guest with a shared disk, does that mean NFS or other shared-storage schemes should
prevent split-brain by themselves?






