[libvirt] feature suggestion: migration network

Mark Wu wudxw at linux.vnet.ibm.com
Fri Jan 11 08:31:48 UTC 2013


On 01/10/2013 08:00 PM, Dan Kenigsberg wrote:
> On Thu, Jan 10, 2013 at 10:45:42AM +0800, Mark Wu wrote:
>> On 01/08/2013 10:46 PM, Yaniv Kaul wrote:
>>> On 08/01/13 15:04, Dan Kenigsberg wrote:
>>>> There's been talk about this for ages, so it's time to have a proper
>>>> discussion and a feature page about it: let us have a "migration"
>>>> network role, and use such networks to carry migration data.
>>>>
>>>> When Engine requests to migrate a VM from one node to another, the VM
>>>> state (BIOS, IO devices, RAM) is transferred over a TCP/IP connection
>>>> that is opened from the source qemu process to the destination qemu.
>>>> Currently, destination qemu listens for the incoming connection on the
>>>> management IP address of the destination host. This has serious
>>>> downsides: a "migration storm" may choke the destination's management
>>>> interface; migration is plaintext, and ovirtmgmt includes Engine,
>>>> which may sit outside the node cluster.
>>>>
>>>> With this feature, a cluster administrator may grant the "migration"
>>>> role to one of the cluster networks. Engine would use that network's IP
>>>> address on the destination host when it requests a migration of a VM.
>>>> With proper network setup, migration data would be separated to that
>>>> network.
>>>>
>>>> === Benefit to oVirt ===
>>>> * Users would be able to define and dedicate a separate network for
>>>>    migration. Users who need quick migration would use NICs with high
>>>>    bandwidth. Users who want to cap the bandwidth consumed by migration
>>>>    could define a migration network over NICs with bandwidth limitations.
>>>> * Migration data can be limited to a separate network that has no
>>>>    layer-2 access from Engine.
>>>>
>>>> === Vdsm ===
>>>> The "migrate" verb should be extended with an additional parameter,
>>>> specifying the address that the remote qemu process should listen on. A
>>>> new argument is to be added to the currently-defined migration
>>>> arguments:
>>>> * vmId: UUID
>>>> * dst: management address of destination host
>>>> * dstparams: hibernation volumes definition
>>>> * mode: migration/hibernation
>>>> * method: rotten legacy
>>>> * ''New'': migration URI, according to
>>>> http://libvirt.org/html/libvirt-libvirt.html#virDomainMigrateToURI2
>>>> such as tcp://<ip of migration network on remote node>
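
For illustration, the extended argument set could look roughly like this
on the Vdsm side. This is only a sketch in python; the keys other than
'miguri' are the existing arguments listed above, and all values are
placeholders:

# Illustrative only: the "migrate" verb parameters with the proposed
# migration URI added.  All values below are placeholders.
migrate_params = {
    'vmId': '11111111-2222-3333-4444-555555555555',   # UUID of the VM
    'dst': '192.0.2.10',             # management address of destination host
    'dstparams': None,               # hibernation volumes definition, if any
    'mode': 'migration',             # migration/hibernation
    'method': 'online',              # the legacy field mentioned above
    'miguri': 'tcp://198.51.100.7',  # NEW: IP on the remote migration network
}

The new 'miguri' is what the destination qemu would listen on, instead of
the management address.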
>>>>
>>>> === Engine ===
>>>> As usual, complexity lies here, and several changes are required:
>>>>
>>>> 1. Network definition.
>>>> 1.1 A new network role - not unlike "display network" - should be
>>>>      added. Only one migration network should be defined on a cluster.
>>>> 1.2 If none is defined, the legacy "use ovirtmgmt for migration"
>>>>      behavior would apply.
>>>> 1.3 A migration network is more likely to be a ''required'' network, but
>>>>      a user may opt for non-required. He may face unpleasant surprises
>>>>      if he wants to migrate his machine, but no candidate host has the
>>>>      network available.
>>>> 1.4 The "migration" role can be granted or revoked on the fly, when
>>>>      hosts are active, as long as there are no currently-migrating VMs.
>>>>
>>>> 2. Scheduler
>>>> 2.1 When deciding which host should be used for automatic
>>>>      migration, take into account the existence and availability of the
>>>>      migration network on the destination host.
>>>> 2.2 For manual migration, let the user migrate a VM to a host with no
>>>>      migration network - if the admin wants to keep jamming the
>>>>      management network with migration traffic, let her.
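
A rough sketch of what 2.1 could mean in code - the function and field
names below are invented for illustration (the actual Engine code is in
Java and structured differently); this is just to make the filtering
rule concrete:

# Hypothetical sketch of the scheduling rule in 2.1; names are invented.
def hosts_eligible_for_migration(candidate_hosts, migration_network):
    """Keep only hosts where the cluster's migration network is attached
    and operational.  If no migration network is defined, fall back to
    the legacy behaviour (1.2) and leave the candidate list untouched."""
    if migration_network is None:
        return candidate_hosts
    return [host for host in candidate_hosts
            if migration_network in host.active_networks]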
>>>>
>>>> 3. VdsBroker migration verb.
>>>> 3.1 For a modern cluster level, with a migration network defined on
>>>>      the destination host, an additional ''miguri'' parameter should be
>>>>      added to the "migrate" command.
>>>>
>>> How is the authentication of the peers handled? Do we need a cert
>>> for each source/destination logical interface?
>>> Y.
>> In my understanding, using a separate migration network doesn't
>> change the current peer authentication.  We still use the URI
>> 'qemu+tls://remoteHost/system' to connect to the target libvirt
>> service if SSL is enabled, and the remote host should be the IP
>> address of the management interface. But we can choose an interface
>> other than the management interface to transport the migration data.
>> We just change the migrateURI, so the current authentication
>> mechanism should still work for this new feature.
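
In other words, the authenticated control connection and the migration
data path use two different URIs. A minimal sketch with the libvirt
python binding - only migrateToURI2() and its argument order come from
libvirt; the domain name, addresses and flag choice are illustrative:

import libvirt

# Control channel: authenticated libvirt-to-libvirt connection over the
# management interface (unchanged by this feature).
dconnuri = 'qemu+tls://remoteHost/system'

# Data channel: the proposed migrateURI, pointing at the destination's
# address on the dedicated migration network (placeholder address).
miguri = 'tcp://198.51.100.7'

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('myvm')    # placeholder domain name
flags = libvirt.VIR_MIGRATE_LIVE | libvirt.VIR_MIGRATE_PEER2PEER

# migrateToURI2(dconnuri, miguri, dxml, flags, dname, bandwidth)
dom.migrateToURI2(dconnuri, miguri, None, flags, None, 0)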
> vdsm-vdsm and libvirt-libvirt communication is authenticated, but I am
> not sure at all that qemu-qemu communication is.
AFAIK, there's no authentication in qemu-to-qemu communication.
>
> After qemu is sprung up on the destination with
>      -incoming <some ip>:<some port> , anything with access to that
> address could hijack the process. Our migrateURI starts with "tcp://"
The destination libvirtd starts qemu listening on that address/port, and
qemu closes the listening socket on <some ip>:<some port> as soon as the
src host connects to it successfully.  So it only listens for a very
small window, but it is still possible for the socket to be hijacked.
For security, we could use iptables to open access to that port
dynamically, and only for the src host, while the migration is in
progress.
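
A rough sketch of that idea in python - this is not existing vdsm code,
just an illustration of opening the migration port only for the source
host, and only while the migration runs:

import subprocess
from contextlib import contextmanager

@contextmanager
def migration_port_guard(src_host, migration_ip, port):
    """Illustrative only: reject everyone except the source host on the
    incoming-migration port for the duration of the migration."""
    rule = ['INPUT', '-p', 'tcp', '-d', migration_ip,
            '--dport', str(port), '!', '-s', src_host, '-j', 'REJECT']
    subprocess.check_call(['iptables', '-I'] + rule)      # add the rule
    try:
        yield                                             # migration runs here
    finally:
        subprocess.check_call(['iptables', '-D'] + rule)  # remove it again

Something like this would close even the small window in which an
arbitrary peer could connect to the -incoming socket.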
> with all the consequences of this. That's a good reason to make sure
> <some ip> has as limited access as possible
>
> But maybe I'm wrong here, and libvir-list can show me the light.
>
> Dan.
>



