[libvirt] [PATCH v2] virsh: add keepalive protocol in virsh
Jiri Denemark
jdenemar at redhat.com
Fri Jun 8 07:46:08 UTC 2012
On Thu, Jun 07, 2012 at 12:03:02 -0600, Eric Blake wrote:
> On 06/05/2012 02:36 AM, Guannan Ren wrote:
> > Bugzilla:https://bugzilla.redhat.com/show_bug.cgi?id=822839
> > add two general virsh options to support keepalive message protocol
> >
> > -i | --keepalive_interval interval time value (default 5 seconds)
> > -n | --keepalive_count number of heartbeats (default 5 times)
> >
> > For non-p2p migration, start keepalive for remote driver to
> > determine the status of network connection, aborting migrating job
> > after defined amount of interval time.
> > ---
> > tools/virsh.c | 88 +++++++++++++++++++++++++++++++++++++++++++++-----------
> > 1 files changed, 70 insertions(+), 18 deletions(-)
>
> Missing a counterpart documentation in virsh.pod.
>
> > @@ -7213,6 +7219,12 @@ doMigrate (void *opaque)
> > dconn = virConnectOpenAuth (desturi, virConnectAuthPtrDefault, 0);
> > if (!dconn) goto out;
> >
> > + data->dconn = dconn;
> > + if (virConnectSetKeepAlive(dconn,
> > + ctl->keepalive_interval,
> > + ctl->keepalive_count) < 0)
> > + vshDebug(ctl, VSH_ERR_WARNING, "migrate: Failed to start keepalive\n");
>
> This makes it impossible to migrate to an older server that lacks
> virConnectSetKeepAlive() support.
virConnectSetKeepAlive returns 1 when remote party does not support keepalive,
which means the above code is perfectly compatible with older servers.
> You need to avoid doing this if keepalive_interval and/or keepalive_count is
> set to a value that avoids keepalive. I think that also means you need to
> distinguish between a normal default of 5 seconds with new servers but
> unspecified by the user (where we just gracefully continue without
> keepalive), compared to an explicit user request of 5 seconds (must fail if
> the keepalive request cannot be honored), and compared to an explicit user
> request of no keepalive.
But I agree that we need to distinguish between default keepalive settings and
those explicitly requested.
> > @@ -7329,6 +7342,13 @@ repoll:
> > goto cleanup;
> > }
> >
> > + if (data->dconn && virConnectIsAlive(data->dconn) <= 0) {
> > + virDomainAbortJob(dom);
> > + vshError(ctl, "%s",
> > + _("Lost connection to destination host"));
> > + goto cleanup;
> > + }
>
> Again, need to be careful of older servers that lack virConnectIsAlive()
> support.
This doesn't call to the server at all. However, virConnectIsAlive returning
-1 doesn't mean the connection is closed, especially if it sets the error to
VIR_ERR_NO_SUPPORT.
> > @@ -18681,6 +18703,18 @@ vshCommandRun(vshControl *ctl, const vshCmd *cmd)
> > if (enable_timing)
> > GETTIMEOFDAY(&before);
> >
> > + /* start keepalive for remote driver */
> > + driver = virConnectGetType(ctl->conn);
> > + if (STREQ_NULLABLE(driver, "QEMU") ||
> > + STREQ_NULLABLE(driver, "xenlight") ||
> > + STREQ_NULLABLE(driver, "UML") ||
> > + STREQ_NULLABLE(driver, "LXC") ||
> > + STREQ_NULLABLE(driver, "remote"))
>
> Why are you limiting this to particular hypervisors? That feels wrong.
> Rather, you should blindly attempt it, and gracefully detect when it is
> unsupported.
Yes, and another issue is that vshCommandRun is a bad place to do this.
Keepalive should be enabled just after opening a connection, i.e., in
vshReconnect, cmdConnect, and vshInit (in addition to doMigrate, where you
correctly do so). I think virsh would benefit from a new wrapper for
virConnectOpenAuth that would also deal with keepalive and which would be
called from all four places.
Jirka
More information about the libvir-list
mailing list