[libvirt] [PATCH] migration: Usable time statistics without requiring NTP

Michal Privoznik mprivozn at redhat.com
Fri Apr 24 09:53:46 UTC 2015


On 23.04.2015 15:25, Jiri Denemark wrote:
> On Thu, Apr 23, 2015 at 11:40:11 +0200, Michal Privoznik wrote:
>> On 23.04.2015 11:18, Jiri Denemark wrote:
>>> virDomainGetJobStats is able to report statistics of a completed
>>> migration, however to get usable downtime and total time statistics both
>>> hosts have to keep synchronized time. To provide at least some
>>> estimation of the times even when NTP daemons are not running on both
>>> hosts we can just ignore the time needed to transfer a migration cookie
>>> to the destination host. The result will be also inaccurate but a bit
>>> more predictable. The total/down time will just be at least what we
>>> report.
>>>
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1213434
>>> Signed-off-by: Jiri Denemark <jdenemar at redhat.com>
>>> ---
>>>  include/libvirt/libvirt-domain.h | 23 ++++++++++++++++++++++-
>>>  src/qemu/qemu_domain.c           | 15 +++++++++++++++
>>>  src/qemu/qemu_domain.h           |  9 +++++++++
>>>  src/qemu/qemu_migration.c        | 26 +++++++++++++-------------
>>>  tools/virsh-domain.c             | 16 ++++++++++++++++
>>>  5 files changed, 75 insertions(+), 14 deletions(-)
>>>
>>
>>> diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
>>> index 1da687c..4b3143f 100644
>>> --- a/src/qemu/qemu_migration.c
>>> +++ b/src/qemu/qemu_migration.c
>>
>>> @@ -3438,18 +3443,9 @@ qemuMigrationConfirmPhase(virQEMUDriverPtr driver,
>>>      /* Update total times with the values sent by the destination daemon */
>>>      if (mig->jobInfo) {
>>>          qemuDomainObjPrivatePtr priv = vm->privateData;
>>> -        if (priv->job.completed) {
>>> -            qemuDomainJobInfoPtr jobInfo = priv->job.completed;
>>> -            if (mig->jobInfo->status.downtime_set) {
>>> -                jobInfo->status.downtime = mig->jobInfo->status.downtime;
>>> -                jobInfo->status.downtime_set = true;
>>> -            }
>>> -            if (mig->jobInfo->timeElapsed)
>>> -                jobInfo->timeElapsed = mig->jobInfo->timeElapsed;
>>> -        } else {
>>> -            priv->job.completed = mig->jobInfo;
>>> -            mig->jobInfo = NULL;
>>> -        }
>>> +        VIR_FREE(priv->job.completed);
>>> +        priv->job.completed = mig->jobInfo;
>>> +        mig->jobInfo = NULL;
>>>      }
>>>  
>>>      if (flags & VIR_MIGRATE_OFFLINE)
>>> @@ -4041,6 +4037,7 @@ qemuMigrationRun(virQEMUDriverPtr driver,
>>>      if (priv->job.completed) {
>>>          qemuDomainJobInfoUpdateTime(priv->job.completed);
>>>          qemuDomainJobInfoUpdateDowntime(priv->job.completed);
>>> +        ignore_value(virTimeMillisNow(&priv->job.completed->sent));
>>
>> So here you mark the time of start of the migration (on the source)...
> 
> This is actually the end of migration, i.e., just be for we sent the
> cookie to the destination.
> 
>>
>>>      }
>>>  
>>>      if (priv->job.current->type == VIR_DOMAIN_JOB_UNBOUNDED)
>>> @@ -5164,8 +5161,11 @@ qemuMigrationFinish(virQEMUDriverPtr driver,
>>>          }
>>>  
>>>          if (mig->jobInfo) {
>>> -            priv->job.completed = mig->jobInfo;
>>> +            qemuDomainJobInfoPtr jobInfo = mig->jobInfo;
>>> +            priv->job.completed = jobInfo;
>>>              mig->jobInfo = NULL;
>>> +            if (jobInfo->sent && virTimeMillisNow(&jobInfo->received) == 0)
>>> +                jobInfo->timeDelta = jobInfo->received - jobInfo->sent;
>>
>> ... and here, once the migration is finished, you compute the time
>> difference.
> 
> And here, when we get the cookie, we compute a difference between now
> and the time stored in "sent". Which means timeDelta will contain the
> difference between now on the source and now on the destination plus any
> time required to transfer and process the cookie.
> 
>> What I am worried about is, what if time on both machines is
>> so off that this value makes no sense (e.g. timeDelta would be a
>> negative number)?
> 
> timeDelta is signed and can of course be negative (if the destination is
> behind the source) and there's no problem with this. The only problem
> would be when the difference is so large it would overflow, but this
> means the difference would have to be > 2^63, which is something like
> 292 million years, which cannot even be represented by time_t. I don't
> think we need to worry about this.
> 
>> Moreover, don't we have it as a migration prerequisite that time on the
>> both machines needs to be synchronized?
> 
> No, there's no such requirement.

Ah. Okay then. ACK. Although it may be nice if we print out the delta in
virsh unconditionally (even if it's zero) so that we can distinguish if
the information is not there or if it's zero.

Michal




More information about the libvir-list mailing list