[Linux-cluster] Live migration of VMs instead of relocation

Lon Hohberger lhh at redhat.com
Fri Nov 30 10:18:26 UTC 2007


On Fri, 2007-11-30 at 11:23 +0100, jr wrote:
> Hello everybody,
> i was wondering if i could somehow get rgmanager to use live migration
> of vms when the preferred member of a failover domain for a certain vm
> service comes up again after a failure. the way it is right now is that
> if rgmanager detects a failure of a node, the virtual machine gets taken
> over by a different node with a lower priority. as soon as the primary
> node comes back into the cluster, rgmanager relocates the vm to that
> node, which means shutting it down and starting it on that node again.
> as i managed to get live migration working in the cluster, i'd like to
> have rgmanager make use of that.
> is there a known configuration for this?
> best regards,

5.1 (+updates) does (or should do?) "migrate-or-nothing" when relocating
VMs back to the preferred node.  That is, if it can't migrate the VM, it
leaves the VM where it is.
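
To make "migrate-or-nothing" concrete, here is a minimal shell sketch of
the idea.  It is not the actual rgmanager code: the function name
migrate_or_nothing and the MIGRATE_CMD variable are made up for
illustration, and the real decision is made inside rgmanager rather than
in a script like this.

```shell
#!/bin/sh
# Hypothetical sketch of "migrate-or-nothing"; NOT the rgmanager source.
# MIGRATE_CMD is an assumption so the logic can be tried without Xen;
# on a real host it would be something like "xm migrate".
MIGRATE_CMD="${MIGRATE_CMD:-xm migrate}"

# migrate_or_nothing VM TARGET
# Try to migrate VM to TARGET.  If the migration fails, do nothing else:
# there is no fallback to a stop/start relocation, so the VM keeps
# running on the node it is currently on.
migrate_or_nothing() {
    vm="$1"
    target="$2"
    if $MIGRATE_CMD "$vm" "$target"; then
        echo "migrated $vm to $target"
        return 0
    fi
    echo "migration of $vm failed; leaving it on the current node"
    return 1
}
```

The point of the sketch is only the absence of a fallback branch: a
failed migration is reported and the VM is otherwise untouched.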

The caveat is of course that the VM is at the top level with no parent
node / no children in the resource tree (i.e. it shouldn't be a child of
a <service>), like so:

  <rm>
    <resources/>
    <service ...>
      <child1 .../>
    </service>
    <vm />
  </rm>

Parent/child dependencies aren't allowed because of the stop/start
nature of other resources: To stop a node in the resource tree, its
children must be stopped, but to start a node, its parents must be
started.

Note that currently as of 5.1, it's pause-migration, not live-migration
- to change this, you need to edit vm.sh and change the "xm migrate ..."
command line to "xm migrate -l ...".

The upside of pause-migration is that it's a simpler and faster overall
operation to transfer the VM from one machine to another.  The downside
is of course that your downtime is several seconds during the migration
rather than the typical <1 sec for live-migration.

We plan to switch to live migrate as default instead of pause-migrate
(with the ability to select pause migration if desired) in the next
update.  Actually the change is in CVS if you don't want to hax around
with the resource agent:

http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/cluster/rgmanager/src/resources/vm.sh?rev=1.1.2.9&content-type=text/plain&cvsroot=cluster&only_with_tag=RHEL5

... hasn't had a lot of testing though. :)

-- Lon



