[libvirt] [PATCH]: Secure migration support for KVM

Daniel P. Berrange berrange at redhat.com
Thu May 21 10:45:54 UTC 2009


On Tue, May 12, 2009 at 01:55:47PM +0200, Chris Lalancette wrote:
> All,
>      Attached is the secure migration patch for libvirt.  What this patch
> implements is a new remote RPC call for secure migration.  On the source of the
> migration, we do a migration from the qemu process to the libvirtd on localhost.
>  As each read() in libvirtd completes, it issues an RPC message to the remote
> libvirtd, using the standard libvirt RPC mechanisms.  On the destination, we do
> essentially the mirror; the libvirtd accepts the data from RPC, and then writes
> it to a qemu container process listening on localhost.
> 
> In order to actually use this, the command-line is pretty complex.  If you want
> to use standard live migration, the command-line looks something like:
> 
> # virsh -c qemu+tls://source.example.org/system migrate --live guest
> qemu+tls://dest.example.org/system
> 
> This says to do a live migration of "guest" from "source.example.org" to
> "dest.example.org", connecting to each of the remote libvirtd via TLS.  Note
> that in this model, the virsh process connects to the remote libvirtd's via the
> -c argument (source) and the destination argument (dest).
> 
> To do secure live migration, this becomes:
> 
> # virsh -c qemu+tls://source.example.org/system migrate --live --secure guest
> qemu+tls://dest.example.org/system qemu+tls://dest.example.org/system

Well, that could clearly be simplified by virsh to match the standard migration
syntax; e.g. if --secure is given, automatically set the migrate URI to
match the destination host URI, since 99% of the time they'll be identical.
Or even do this in the virDomainMigrate() API if the URI is NULL &
VIR_MIGRATE_SECURE is set.
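
A rough sketch of what that defaulting could look like inside virDomainMigrate()
(completely untested; the helper name is made up, and VIR_MIGRATE_SECURE is the
flag as defined in the patch):

    #include <libvirt/libvirt.h>
    #include <string.h>

    /* Illustrative only: pick the migration URI when the caller passed NULL
     * and asked for a secure migration.  Falls back to the URI of the
     * destination connection, which is what callers want 99% of the time. */
    static char *
    migrateDefaultURI(virConnectPtr dconn, const char *uri, unsigned long flags)
    {
        if (uri != NULL)
            return strdup(uri);

        if (flags & VIR_MIGRATE_SECURE)
            /* Newly allocated string, or NULL on error; caller must free it */
            return virConnectGetURI(dconn);

        return NULL; /* native migration: let the driver pick its own URI */
    }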

> diff --git a/include/libvirt/libvirt.h b/include/libvirt/libvirt.h
> index 30f559d..b9362d5 100644
> --- a/include/libvirt/libvirt.h
> +++ b/include/libvirt/libvirt.h
> @@ -319,6 +319,7 @@ typedef virDomainInterfaceStatsStruct *virDomainInterfaceStatsPtr;
>  /* Domain migration flags. */
>  typedef enum {
>    VIR_MIGRATE_LIVE              = 1, /* live migration */
> +  VIR_MIGRATE_SECURE            = 2, /* secure migration */
>  } virDomainMigrateFlags;
>  
>  /* Domain migration. */


One thing that occurs to me is that perhaps we're wrong in referring to this
as secure migration. a) libvirtd <-> libvirtd is not guaranteed to be secure
if the administrator has turned off authentication on the TCP socket (stupid
on the admin's part, but possible), and b) it implies that migration without
this flag is not secure. In fact native migration can be secure depending on
the underlying driver - e.g. it is possible to configure the latest XenD to
use an SSL socket for migration.

Thus I reckon we should change this to be called   VIR_MIGRATE_TUNNELLED
or VIR_MIGRATE_PROXIED, because the flag is effectively saying tunnel the
migration data via libvirtd.

It is interesting for an application to know whether the migration is going
to be secure or not, though. We may wish to expand the capabilities XML in
the driver so that each URI declared carries a secure='yes|no' flag, telling
apps whether that URI will be secure under the current hypervisor config.
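
For example, the migration_features block in the capabilities XML could grow
an attribute along these lines (the secure= attribute is purely illustrative,
not something libvirt emits today); a XenD configured for SSL migration would
then report secure='yes' on its transport:

    <migration_features>
      <live/>
      <uri_transports>
        <uri_transport secure='no'>tcp</uri_transport>
      </uri_transports>
    </migration_features>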

It may also be desirable to add a flag VIR_MIGRATE_SECURE, meaning 'require
data security' for the migration. If libvirt determined that the underlying
data transport was not secure, it would return an error and not attempt to
start the migration.
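
Putting those two ideas together, something like this (bit values and the
helper are sketches, not the posted patch):

    /* Illustrative flag split: TUNNELLED names the mechanism (proxy the
     * data through libvirtd), SECURE becomes a separate "require a secure
     * transport" request that can fail up front. */
    typedef enum {
        VIR_MIGRATE_LIVE      = 1, /* live migration */
        VIR_MIGRATE_TUNNELLED = 2, /* tunnel migration data via libvirtd */
        VIR_MIGRATE_SECURE    = 4, /* refuse to migrate over an insecure transport */
    } virDomainMigrateFlags;

    /* Hypothetical up-front check in virDomainMigrate(): abort before any
     * data moves if the caller demanded security the transport can't give. */
    static int
    checkMigrateSecurity(unsigned long flags, int transport_is_secure)
    {
        if ((flags & VIR_MIGRATE_SECURE) && !transport_is_secure)
            return -1; /* caller reports an error and never starts migration */
        return 0;
    }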


> @@ -2734,6 +2735,13 @@ virDomainMigrate (virDomainPtr domain,
>          goto error;
>      }
>  
> +    if ((flags & VIR_MIGRATE_SECURE) && uri == NULL) {
> +        /* if you are doing a secure migration, you *must* also pass a uri */
> +        virLibConnError(conn, VIR_ERR_INVALID_ARG,
> +                        _("requested SECURE migration, but no URI passed"));
> +        goto error;
> +    }

I'm wondering why you can't just do virConnectGetURI(dconn) on the
2nd libvirt connection, which refers to the destination?

>  /*
>   * Not for public use.  This function is part of the internal
> + * implementation of secure migration in the remote case.
> + */
> +int virConnectSecureMigrationData(virConnectPtr conn,
> +                                  const char *cookie,
> +                                  int cookielen,
> +                                  char *data,
> +                                  int datalen)
> +{
> +    virResetLastError();
> +
> +    if (conn->flags & VIR_CONNECT_RO) {
> +        virLibConnError(conn, VIR_ERR_OPERATION_DENIED, __FUNCTION__);
> +        goto error;
> +    }
> +
> +    if (conn->driver->secureMigrationData) {
> +        int ret;
> +        ret = conn->driver->secureMigrationData (conn, cookie, cookielen, data, datalen);
> +        if (ret < 0)
> +            goto error;
> +        return ret;
> +    }
> +
> +    virLibConnError (conn, VIR_ERR_NO_SUPPORT, __FUNCTION__);
> +
> +error:
> +    /* Copy to connection error object for back compatability */
> +    virSetConnError(conn);
> +    return -1;
> +}

Having thought about this for a long time now, I'm a little concerned
about the way the driver APIs are structured. 

On the destination host, for a migration operation the QEMU driver
is doing:

    virConnectMigratePrepare()
    repeat 'n' times: virConnectSecureMigrationData()
    virConnectMigrateFinish()


So the driver has to maintain state across multiple method calls
for an arbitrary amount of time, with no way to abort the migration
operation & return an error if the client (maliciously) stops sending
data via virConnectSecureMigrationData(). It can register a timer
to abort it after some amount of time, but it can't get an error back
to the client that way, or easily cause libvirtd to drop the client
connection.

I'm trying to think up a way to combine this whole sequence of steps
into a single API call from the driver's point of view, somehow providing
it with a data stream from which it can read the incoming data. I'm
thinking of an API call

  virConnectMigrateStream(virConnectPtr conn, const char *xml, virInputStreamPtr is)

where virInputStreamPtr provides some kind of callback that can be
used by the driver to read more data off the wire from the client.
This would mean the migration state has a lifetime bounded by the
execution of this one API call (though it would entail multiple RPC
messages on the wire).
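
To make that a bit more concrete, one possible shape for such a stream
(struct layout and callback signatures are pure guesswork at this point):

    #include <stddef.h>
    #include <libvirt/libvirt.h>

    typedef struct _virInputStream virInputStream;
    typedef virInputStream *virInputStreamPtr;

    /* Hypothetical stream handle: the remote driver supplies the callbacks,
     * the hypervisor driver just pulls migration data through them. */
    struct _virInputStream {
        void *opaque;   /* RPC client / connection state */

        /* Block until up to 'nbytes' of data arrive; return the number of
         * bytes copied into 'buf', 0 at end of stream, or -1 on error or
         * timeout so the driver can abort the incoming migration. */
        int (*read)(virInputStreamPtr st, char *buf, size_t nbytes);

        /* Tear the stream down; 'aborted' lets the RPC layer drop the
         * client connection instead of completing the call normally. */
        void (*close)(virInputStreamPtr st, int aborted);
    };

    /* Hypothetical single-call destination entry point: the whole incoming
     * migration lives inside this one invocation. */
    int virConnectMigrateStream(virConnectPtr conn,
                                const char *xml,
                                virInputStreamPtr st);

In the QEMU driver the read callback could simply feed a loop that writes
into the listening QEMU process, and a -1 return would let it kill the
incoming domain and report an error immediately, instead of relying on a
watchdog timer.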



>  static int
>  qemudDomainMigratePrepare2 (virConnectPtr dconn,
> -                            char **cookie ATTRIBUTE_UNUSED,
> -                            int *cookielen ATTRIBUTE_UNUSED,
> +                            char **cookie,
> +                            int *cookielen,
>                              const char *uri_in,
>                              char **uri_out,
> -                            unsigned long flags ATTRIBUTE_UNUSED,
> +                            unsigned long flags,
>                              const char *dname,
>                              unsigned long resource ATTRIBUTE_UNUSED,
>                              const char *dom_xml)
> @@ -4748,7 +4751,9 @@ qemudDomainMigratePrepare2 (virConnectPtr dconn,
>      char migrateFrom [64];
>      const char *p;
>      virDomainEventPtr event = NULL;
> -    int ret = -1;;
> +    int ret = -1;
> +    struct secure_mig *secureMigData = NULL;
> +    struct sockaddr_in a;


> +        *cookie = (char *)secureMigData;
> +        *cookielen = sizeof(*secureMigData);
> +    }

Sending a pointer to our data structure back to the client app....


>  static virDomainPtr
>  qemudDomainMigrateFinish2 (virConnectPtr dconn,
>                             const char *dname,
> -                           const char *cookie ATTRIBUTE_UNUSED,
> -                           int cookielen ATTRIBUTE_UNUSED,
> +                           const char *cookie,
> +                           int cookielen,
>                             const char *uri ATTRIBUTE_UNUSED,
> -                           unsigned long flags ATTRIBUTE_UNUSED,
> +                           unsigned long flags,
>                             int retcode)
>  {
>      struct qemud_driver *driver = dconn->privateData;
> @@ -5034,6 +5270,7 @@ qemudDomainMigrateFinish2 (virConnectPtr dconn,
>      virDomainPtr dom = NULL;
>      virDomainEventPtr event = NULL;
>      char *info = NULL;
> +    struct secure_mig *secureMigData;
>  
>      qemuDriverLock(driver);
>      vm = virDomainFindByName(&driver->domains, dname);
> @@ -5043,6 +5280,20 @@ qemudDomainMigrateFinish2 (virConnectPtr dconn,
>          goto cleanup;
>      }
>  
> +    if (flags & VIR_MIGRATE_SECURE) {
> +        /* in this case, we need to close out the cookie resources */
> +        if (cookie == NULL || cookielen != sizeof(struct secure_mig)) {
> +            qemudReportError (dconn, NULL, NULL, VIR_ERR_INVALID_ARG,
> +                              _("Bad size for secure migration cookie"));
> +            goto cleanup;
> +        }
> +
> +        secureMigData = (struct secure_mig *)cookie;
> +
> +        close(secureMigData->socket);
> +        VIR_FREE(secureMigData);
> +    }

And now trusting that the client sent us back the same pointer in this
method call :-( 
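
One way the pointer round-trip could be avoided (sketch only, not what the
patch does): keep the state in a table owned by the driver and put nothing
but an opaque id in the cookie, so Finish has to look it up and can reject
anything it doesn't recognise:

    #include <string.h>

    /* Hypothetical driver-side bookkeeping for in-flight tunnelled
     * migrations; the cookie handed to the client is just the id. */
    struct secure_mig_state {
        int id;
        int socket;   /* fd of the localhost connection to QEMU */
        int in_use;
    };

    #define MAX_MIGRATIONS 16
    static struct secure_mig_state migrations[MAX_MIGRATIONS];

    /* Validate a cookie and map it back to our state; a forged, stale or
     * truncated cookie gets NULL instead of being dereferenced. */
    static struct secure_mig_state *
    secureMigLookup(const char *cookie, int cookielen)
    {
        int id;

        if (cookie == NULL || cookielen != (int)sizeof(id))
            return NULL;
        memcpy(&id, cookie, sizeof(id));
        if (id < 0 || id >= MAX_MIGRATIONS || !migrations[id].in_use)
            return NULL;
        return &migrations[id];
    }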


Daniel
-- 
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|



