[libvirt] time: event poll may become un-triggerable after changing system clock.

Daniel P. Berrange berrange at redhat.com
Tue May 12 08:56:13 UTC 2015

On Tue, May 12, 2015 at 04:14:37PM +0800, zhang bo wrote:
> event poll may become un-triggerable after changing system clock. 
> The steps to reproduce the problem:
> 1 run event-test
> 2 define and start a domain with name vm1.
> 3 destroy vm1
> 4 set the system time back by 1 hour once timer.expiresAt has been set in virEventPollUpdateTimeout
>   (and before virEventPollDispatchTimeouts()).
> 5 event-test will receive no message until 1 hour later.
> The reasons for the problem are:
> 1 The value of timer.expiresAt is set by virTimeMillisNowRaw. virTimeMillisNowRaw is affected by settimeofday()
>   because it uses CLOCK_REALTIME to get the time.
> 2 If we set the system time far back after timer.expiresAt has been set, timer.expiresAt
>   is not affected, while the current time read by virEventPollDispatchTimeouts is.
>   Suppose it's now May 12th and we set the clock to the 10th: expiresAt is still the 12th, but the time
>   virEventPollDispatchTimeouts gets is the 10th.
>         if (eventLoop.timeouts[i].expiresAt <= (now+20)) { // expiresAt will not be less than now until 2 days later.
> *Solution (not good enough)*:
> 1 change the clock mode in virTimeMillisNowRaw from REALTIME to MONOTONIC, which would not be affected by
> settimeofday().
> 2 add the system start time since the epoch to the value from clock_gettime(*MONOTONIC*), making it equal to the value from REALTIME.
> 3 As the timestamp of log messages should follow system time, we keep that on REALTIME as before.
> However, there are still problems:
> 1 pthread_cond_wait() gets time with REALTIME mode. When we change system time, pthread_cond_wait() may still be affected.
> So, is there a better solution? Thanks in advance.
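
For illustration, the mismatch described in the quoted steps can be sketched
in isolation. This is only a minimal sketch, not libvirt's code: now_ms() and
timer_is_due() are hypothetical helpers, and only the "+ 20" comparison is
taken from the quoted line. A deadline stored as an absolute wall-clock value
stops being "due" once the wall clock is stepped back, whereas
CLOCK_MONOTONIC is not affected by settimeofday().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: milliseconds read from the given clock, or -1 on
 * error. Pass CLOCK_REALTIME for wall-clock time or CLOCK_MONOTONIC for a
 * clock that settimeofday() cannot step. */
static long long now_ms(clockid_t clock)
{
    struct timespec ts;

    if (clock_gettime(clock, &ts) != 0)
        return -1;
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Hypothetical helper mirroring the quoted check: a timer fires when its
 * absolute deadline is at most 20 ms after the current time. */
static bool timer_is_due(long long expires_at_ms, long long now)
{
    return expires_at_ms <= now + 20;
}
```

If expires_at_ms was computed from CLOCK_REALTIME and the clock is then set
back one hour, timer_is_due() stays false for roughly that hour. Computing
both the deadline and "now" from now_ms(CLOCK_MONOTONIC) avoids that for this
one check, but not for the pthread timed waits mentioned in the quoted text.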

Simply don't change the system time by massive deltas. Libvirt is not going
to be the only app to be affected. As you mention it is going to hit the
pthread_cond_wait() call which will likely affect pretty much every single
non-trivial process running on the system. I'd expect other apps have much
the same problem with calculating poll sleeps too.
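
As a side note on the pthread call: pthread_cond_wait() itself takes no
timeout, it is pthread_cond_timedwait() that compares its deadline against a
clock, and POSIX lets a process pick that clock per condition variable with
pthread_condattr_setclock(). The following is only a sketch under the
assumption of a Linux/glibc system; cond_init_monotonic() and cond_wait_ms()
are hypothetical names, not libvirt functions. It covers a single process's
own timed waits and does nothing for other applications on the system.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: initialise a condition variable whose timed waits
 * are measured against CLOCK_MONOTONIC instead of CLOCK_REALTIME.
 * Returns 0 on success or an errno-style code. */
static int cond_init_monotonic(pthread_cond_t *cond)
{
    pthread_condattr_t attr;
    int rc = pthread_condattr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
    if (rc == 0)
        rc = pthread_cond_init(cond, &attr);
    pthread_condattr_destroy(&attr);
    return rc;
}

/* Hypothetical helper: wait up to timeout_ms on a condition variable created
 * by cond_init_monotonic(). The caller must hold the mutex. Returns 0 when
 * woken (possibly spuriously, so a real caller re-checks its predicate in a
 * loop) or ETIMEDOUT when the deadline passes. */
static int cond_wait_ms(pthread_cond_t *cond, pthread_mutex_t *mutex,
                        long timeout_ms)
{
    struct timespec deadline;

    if (clock_gettime(CLOCK_MONOTONIC, &deadline) != 0)
        return errno;
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    return pthread_cond_timedwait(cond, mutex, &deadline);
}
```

Because the deadline is taken from the same monotonic clock the condition
variable was created with, stepping the wall clock should not lengthen or
shorten such a wait.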

If you need to massively change the system time this should be done in
single user mode, or with a reboot. Once a system is running it should be
kept synced with NTPD which will only ever change system time in very
small increments and so never cause this problem.

|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
