[libvirt] [openstack-dev] [nova] live-snapshot/cloning of virtual machines

Fri Aug 16 15:09:06 UTC 2013

> On Fri, Aug 16, 2013 at 11:05:19AM +0100, Daniel P. Berrange wrote:
>> On Wed, Aug 14, 2013 at 04:53:01PM -0700, Vishvananda Ishaya wrote:
>>> Hi Everyone,
>>> 
>>> I have been trying for some time to get the code for the live-snapshot blueprint[1]
>>> in. Going through the review process for the rpc and interface code[2] was easy. I
>>> suspect the api-extension code[3] will also be relatively trivial to get in. The
>>> main concern is with the libvirt driver implementation[4]. I'd like to discuss the
>>> concerns and see if we can make some progress.
>>> 
>>> Short Summary (tl;dr)
>>> =====================
>>> 
>>> I propose we merge live-cloning as an experimental feature for Havana and have the
>>> api extension disabled by default.
>>> 
>>> Overview
>>> ========
>>> 
>>> First of all, let me express the value of live snapshotting. The
>>> slowest part of the vm provisioning process is generally booting
>>> of the OS.
> 
> Like Dan I'm dubious about this whole plan.  But this ^^ statement in
> particular.  I would like to see hard data to back this up.

What we need to keep in mind is that "boot" is a small part of the picture, at least "boot" as the term is commonly used in Linux.

Consider a WebSphere-like Java bundle of code. These things take a while to load. JIT-ed methods provide a tremendous performance boost. Never mind if the server constructs secondary indices to perform fast lookups of data.

That is just Linux. Windows is well known for pounding storage fabrics with thousands of small reads during boot storms. Certainly a Windows boot sequence has baked in a lot of service startup sequences that prime a lot of memory content for performance objectives.

Boot here means "ready to rock-n-roll", not "Cirros is up."

We have live deployments that are based on bypassing the entire *application startup* sequence and having a server ready to provide high-performance responses to queries once spawned from a live saved image.

> 
> You should be able to boot an OS pretty quickly, and furthermore it's
> (a) much safer for all the reasons Dan outlines, and (b) improvements
> that you make to boot times help everyone.
> 
> [...]
>>> 2. Security Concerns
>>> ====================
>>> 
>>> There are a number of security issues with loading state from another vm. Here is a
>>> short list of things that need to be done just to make a cloned vm usable:
>>> 
>>> a) mac address needs to be recreated
>>> b) entropy pool needs to be reset
>>> c) host name must be reset
>>> d) host keys must be regenerated
>>> 
>>> There are others, and trying to clone a running application as well may expose other
>>> sensitive data, especially if users are snapshotting vms and making them public.
> 
> Are we talking about cloning VMs that you already trust, or cloning
> random VMs and allowing random other users to use them?  These would
> lead to very different solutions.  In the first case, you only care
> about correctness, not security.  In the second case, you care about
> security as well as correctness.

Case number one.

The correctness issues are a hard problem, and a particularly hard one in Windows, but they are pragmatically solvable.

For a common scenario in Linux, renewing DHCP leases and leveling your entropy pool are what you need.
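
A minimal sketch of that Linux case, meant to run inside the freshly cloned guest. Everything named here is an assumption for illustration: the `eth0` interface, the use of `dhclient`, and the idea that the host hands each clone its own seed bytes. Writing to /dev/urandom mixes the bytes into the kernel pool without crediting any entropy, so this only makes sibling clones diverge; it is not a full answer to the entropy problem.

```python
import subprocess


def post_clone_commands(interface="eth0"):
    """Return the commands a cloned Linux guest might run to renew its lease.

    The interface name and the use of dhclient are assumptions.
    """
    return [
        ["dhclient", "-r", interface],  # release the lease inherited from the parent
        ["dhclient", interface],        # request a fresh lease for this clone
    ]


def reseed_entropy(seed, pool_path="/dev/urandom"):
    """Mix per-clone seed bytes into the kernel entropy pool.

    The seed must differ for every clone (for example, supplied by the host);
    bytes generated inside the guest would repeat across clones.
    """
    with open(pool_path, "wb") as pool:
        pool.write(seed)


def fix_up_clone(seed, interface="eth0", dry_run=True):
    """Reseed entropy first, then renew DHCP; return the commands involved."""
    commands = post_clone_commands(interface)
    if not dry_run:
        reseed_entropy(seed)
        for command in commands:
            subprocess.run(command, check=True)
    return commands
```

With `dry_run=True` (the default) nothing is executed and the function only reports what it would run, which makes it safe to try outside a guest.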

> 
> I highly doubt the second case is possible because scrubbing the disk
> is going to take far too long for any supposed time-saving to matter.

That would be very counter-productive, so yes, focusing on the first case.
> 
> As Dan says, even the first case is dubious because it won't be correct.
> 
>> The libguestfs project provides tools to perform offline cloning of
>> VM disk images.  Its virt-sysprep knows how to delete a lot of (but by
>> no means all possible) sensitive file data for common Linux &
>> Windows OS. It still has to be combined with use of the
>> virt-sparsify tool though, to ensure the deleted data is actually
>> purged from the VM disk image as well as the filesystem, by
>> releasing all unused VM disk sectors back to the host storage (and
>> not all storage supports that).
> 
> Links to the tools that Dan mentions:
> 
> http://libguestfs.org/virt-sysprep.1.html
> http://libguestfs.org/virt-sparsify.1.html

Virt-sparsify is not strictly relevant here. The disk side of live images is carried out with qcow2.

Virt-sysprep is great work and highly relevant.

But virt-sysprep allows us to see the argument in a different light. Have you noticed nova does not run virt-sysprep before booting an ephemeral instance from an image? (AFAIK, and I could be wrong, not even regenerating host ssh keys is part of the assured workflow.) Furthermore, one can create arbitrary (cold, non-live) images at any time from live instances.

This isn't necessarily wrong. It underpins massive deployments and pragmatically adds value. The fundamental semantics at play with live instances are the same: know what you are doing, ephemeral instances, bound to your tenant.

So as long as we take care of not weakening cryptographic foundations and correctly reconfiguring the identity, all the principles above still apply: know what you are doing, ephemeral instances, bound to your tenant.

We are working on a proof of concept using the qemu guest agent to undertake all these tasks. It is a PoC to show it's doable. It's neither clean nor asking for a mainline merge (at the moment). It's not the only solution for the guest-side problem.
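
To give a rough idea of what driving those tasks through the guest agent could look like, here is a hedged sketch that only builds the JSON requests. It is not the PoC code. It assumes a guest agent that offers the `guest-exec` command, which was added in QEMU releases later than this thread. The binary paths, interface name, and hostname are hypothetical, and the transport to the agent is deliberately left out.

```python
import json


def agent_command(name, arguments=None):
    """Serialize one guest agent request as a JSON string."""
    message = {"execute": name}
    if arguments:
        message["arguments"] = arguments
    return json.dumps(message)


def guest_exec(path, *args):
    """Build a guest-exec request that runs `path` with `args` in the guest."""
    return agent_command(
        "guest-exec",
        {"path": path, "arg": list(args), "capture-output": True},
    )


def post_clone_requests(interface="eth0", hostname="clone-01"):
    """Requests a host-side tool might send after spawning a clone.

    Paths, interface, and hostname are illustrative assumptions.
    """
    return [
        agent_command("guest-ping"),                    # check the agent is responsive
        guest_exec("/sbin/dhclient", "-r", interface),  # drop the inherited lease
        guest_exec("/sbin/dhclient", interface),        # obtain a new lease
        guest_exec("/bin/hostname", hostname),          # give the clone its own name
    ]
```

Each string returned by `post_clone_requests()` is one request; how it reaches the agent (for example, through the management layer's agent passthrough) depends on the deployment.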

And one final very important note for the libvirt list and those who may be tuning in just now: no changes to libvirt are being asked for here. This is a nova-side tech preview feature.

Best,
Andres

> 
> Note these tools can only be used on offline machines.
> 
> Rich.
> 
> -- 
> Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
> virt-top is 'top' for virtual machines.  Tiny program with many
> powerful monitoring features, net stats, disk stats, logging, etc.
> http://people.redhat.com/~rjones/virt-top