[libvirt-users] Clock problems on live migration

Paul Boven boven at jive.nl
Mon Mar 24 18:14:17 UTC 2014


Hi everyone,

During live migration, Linux guests frequently get stuck and become 
unresponsive, while the host CPU utilization for that guest goes to 
100%. Sometimes they recover, and dmesg in the guest then shows that 
there has been a clock problem during the live migration:

Clocksource tsc unstable (delta = 35882846234 ns)

So the TSC did a jump of nearly 36 seconds.
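
For anyone wanting to compare these jumps across several guests, here is a minimal sketch that pulls the clocksource name and the delta out of such a dmesg line and converts it to seconds. The regular expression is an assumption based only on the single message quoted above, and the function name parse_clock_jump is made up for this example.

```python
import re

# Pattern modelled on the one kernel message quoted above (an assumption;
# other kernel versions may word this message differently).
DELTA_RE = re.compile(r"Clocksource (\S+) unstable \(delta = (-?\d+) ns\)")


def parse_clock_jump(line):
    """Return (clocksource, delta_in_seconds), or None if the line does not match."""
    match = DELTA_RE.search(line)
    if match is None:
        return None
    name = match.group(1)
    delta_ns = int(match.group(2))
    # The kernel reports the delta in nanoseconds; convert to seconds.
    return name, delta_ns / 1e9


if __name__ == "__main__":
    msg = "Clocksource tsc unstable (delta = 35882846234 ns)"
    result = parse_clock_jump(msg)
    if result is not None:
        name, seconds = result
        print(f"{name} jumped by {seconds:.2f} s")
```

Feeding it the output of dmesg line by line would give one (clocksource, seconds) pair per reported jump.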

Migrations often fail when going from server A to B, but will then work 
fine in the other direction.

Both servers are locked to the same NTP source, and are well within 
1 ms of one another.

Both hosts are running Ubuntu 13.04 with these versions (from Ubuntu 
packages):

Kernel: 3.8.0-35-generic x86_64
Libvirt: 1.0.2
Qemu: 1.4.0
Gluster-fs: 3.4.2 (libvirt accesses the images via the filesystem, not 
using libgfapi yet).
The interconnect between both machines (both for migration and gluster) 
is 10GbE.

We have different guests (all Ubuntu releases, 13.04 and 13.10), and 
they all seem to be affected.

Clocksource: kvm-clock on all guests.
Clock entry from the guest XML: <clock offset='utc'/>
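
A minimal sketch of how the clocksource can be checked from inside a guest, assuming the standard Linux sysfs location /sys/devices/system/clocksource/clocksource0. The helper name read_clocksource and its sysfs_dir parameter are made up for this example.

```python
from pathlib import Path

# Standard sysfs location on Linux guests; passed as a parameter so the
# helper can be pointed elsewhere (for example at a test directory).
SYSFS_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksource(sysfs_dir=SYSFS_DIR):
    """Return (current, available) clocksources as (str, list of str)."""
    base = Path(sysfs_dir)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


if __name__ == "__main__":
    try:
        current, available = read_clocksource()
    except OSError as exc:
        # Not a Linux system, or sysfs is not mounted.
        print(f"could not read clocksource: {exc}")
    else:
        print(f"current: {current}")
        print(f"available: {', '.join(available)}")
```

On the guests described here the current value would be expected to read kvm-clock. If it shows tsc after a migration, the guest kernel has switched clocksource on its own.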

As far as I've read in the kvm-clock documentation, it specifically 
supports live migration, so I'm a bit surprised by these problems. 
There isn't much information to be found on this issue, although I 
have found postings by others who seem to have run into the same 
problem, but without a solution.

Any help would be much appreciated.

Regards, Paul Boven.
-- 
Paul Boven <boven at jive.nl> +31 (0)521-596547
Unix/Linux/Networking specialist
Joint Institute for VLBI in Europe - www.jive.nl
VLBI - It's a fringe science



