[libvirt] [PATCH] migration: Usable time statistics without requiring NTP
Jiri Denemark
jdenemar at redhat.com
Fri Apr 24 13:17:13 UTC 2015
On Fri, Apr 24, 2015 at 11:53:46 +0200, Michal Privoznik wrote:
> On 23.04.2015 15:25, Jiri Denemark wrote:
> > On Thu, Apr 23, 2015 at 11:40:11 +0200, Michal Privoznik wrote:
> >> On 23.04.2015 11:18, Jiri Denemark wrote:
> >>> virDomainGetJobStats is able to report statistics of a completed
> >>> migration, however to get usable downtime and total time statistics both
> >>> hosts have to keep synchronized time. To provide at least some
> >>> estimation of the times even when NTP daemons are not running on both
> >>> hosts we can just ignore the time needed to transfer a migration cookie
> >>> to the destination host. The result will be also inaccurate but a bit
> >>> more predictable. The total/down time will just be at least what we
> >>> report.
> >>>
> >>> https://bugzilla.redhat.com/show_bug.cgi?id=1213434
> >>> Signed-off-by: Jiri Denemark <jdenemar at redhat.com>
> >>> ---
> >>> include/libvirt/libvirt-domain.h | 23 ++++++++++++++++++++++-
> >>> src/qemu/qemu_domain.c | 15 +++++++++++++++
> >>> src/qemu/qemu_domain.h | 9 +++++++++
> >>> src/qemu/qemu_migration.c | 26 +++++++++++++-------------
> >>> tools/virsh-domain.c | 16 ++++++++++++++++
> >>> 5 files changed, 75 insertions(+), 14 deletions(-)
> >>>
> >>
> >>> diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
> >>> index 1da687c..4b3143f 100644
> >>> --- a/src/qemu/qemu_migration.c
> >>> +++ b/src/qemu/qemu_migration.c
> >>
> >>> @@ -3438,18 +3443,9 @@ qemuMigrationConfirmPhase(virQEMUDriverPtr driver,
> >>> /* Update total times with the values sent by the destination daemon */
> >>> if (mig->jobInfo) {
> >>> qemuDomainObjPrivatePtr priv = vm->privateData;
> >>> - if (priv->job.completed) {
> >>> - qemuDomainJobInfoPtr jobInfo = priv->job.completed;
> >>> - if (mig->jobInfo->status.downtime_set) {
> >>> - jobInfo->status.downtime = mig->jobInfo->status.downtime;
> >>> - jobInfo->status.downtime_set = true;
> >>> - }
> >>> - if (mig->jobInfo->timeElapsed)
> >>> - jobInfo->timeElapsed = mig->jobInfo->timeElapsed;
> >>> - } else {
> >>> - priv->job.completed = mig->jobInfo;
> >>> - mig->jobInfo = NULL;
> >>> - }
> >>> + VIR_FREE(priv->job.completed);
> >>> + priv->job.completed = mig->jobInfo;
> >>> + mig->jobInfo = NULL;
> >>> }
> >>>
> >>> if (flags & VIR_MIGRATE_OFFLINE)
> >>> @@ -4041,6 +4037,7 @@ qemuMigrationRun(virQEMUDriverPtr driver,
> >>> if (priv->job.completed) {
> >>> qemuDomainJobInfoUpdateTime(priv->job.completed);
> >>> qemuDomainJobInfoUpdateDowntime(priv->job.completed);
> >>> + ignore_value(virTimeMillisNow(&priv->job.completed->sent));
> >>
> >> So here you mark the time of start of the migration (on the source)...
> >
> > This is actually the end of migration, i.e., just be for we sent the
> > cookie to the destination.
> >
> >>
> >>> }
> >>>
> >>> if (priv->job.current->type == VIR_DOMAIN_JOB_UNBOUNDED)
> >>> @@ -5164,8 +5161,11 @@ qemuMigrationFinish(virQEMUDriverPtr driver,
> >>> }
> >>>
> >>> if (mig->jobInfo) {
> >>> - priv->job.completed = mig->jobInfo;
> >>> + qemuDomainJobInfoPtr jobInfo = mig->jobInfo;
> >>> + priv->job.completed = jobInfo;
> >>> mig->jobInfo = NULL;
> >>> + if (jobInfo->sent && virTimeMillisNow(&jobInfo->received) == 0)
> >>> + jobInfo->timeDelta = jobInfo->received - jobInfo->sent;
> >>
> >> ... and here, once the migration is finished, you compute the time
> >> difference.
> >
> > And here, when we get the cookie, we compute a difference between now
> > and the time stored in "sent". Which means timeDelta will contain the
> > difference between now on the source and now on the destination plus any
> > time required to transfer and process the cookie.
> >
> >> What I am worried about is, what if time on both machines is
> >> so off that this value makes no sense (e.g. timeDelta would be a
> >> negative number)?
> >
> > timeDelta is signed and can of course be negative (if the destination is
> > behind the source) and there's no problem with this. The only problem
> > would be when the difference is so large it would overflow, but this
> > means the difference would have to be > 2^63, which is something like
> > 292 million years, which cannot even be represented by time_t. I don't
> > think we need to worry about this.
> >
> >> Moreover, don't we have it as a migration prerequisite that time on the
> >> both machines needs to be synchronized?
> >
> > No, there's no such requirement.
>
> Ah. Okay then. ACK. Although it may be nice if we print out the delta in
> virsh unconditionally (even if it's zero) so that we can distinguish if
> the information is not there or if it's zero.
Done and pushed, thanks.
Jirka
More information about the libvir-list
mailing list