[libvirt] Notes: Non-shared storage live migration w/ active blockcommit

Wed Oct 8 09:07:11 UTC 2014

On Tue, Oct 07, 2014 at 05:35:00PM -0600, Eric Blake wrote:
> On 09/25/2014 08:26 AM, Kashyap Chamarthy wrote:
> > These notes are based on an IRC conversation with Eric Blake about
> > efficient non-shared storage live migration. Thought I'd post my notes
> > here before I forget. Please review and point out any inaccuracies.
> > 
> > Procedure
> > ---------
> > 
> > (1) Starting from disk A, create a snapshot A <- A':
> > 
> >     $ virsh snapshot-create-as \
> >         --domain vm1 snap1 snap1-desc \
> >         --diskspec "vda,file=/export/vmimages/A'.qcow2" \
> >         --disk-only --atomic
> 
> If you are using this snapshot only for the side-effect of growing the
> chain, you can add --no-metadata here instead of deleting the snapshot
> later when it gets invalidated [1].  Of course, if you pass
> --no-metadata, the snapshot name (snap1) and description (snap1-desc)
> are no longer important.

Right, until a cleaner mechanism for reverting to external snapshots is
in place, I should make a habit of passing '--no-metadata' when creating
external snapshots, for the reason above (I usually end up deleting the
related libvirt metadata as part of cleanup anyway).
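
For my own reference, a minimal sketch of step (1) with '--no-metadata'
folded in. The 'run' and 'grow_chain' helpers are hypothetical names of
mine, and 'vm1'/'vda' are just the example domain and disk from this
thread. It only prints the command unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Dry-run by default: print the command instead of executing it.
# Set DRY_RUN=0 to run against a real libvirt host.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: add one overlay on top of DISK in DOMAIN without
# leaving snapshot metadata behind in libvirt, so there is nothing to
# delete later when the chain changes.
grow_chain() {
    domain=$1 disk=$2 overlay=$3
    run virsh snapshot-create-as --domain "$domain" \
        --diskspec "$disk,file=$overlay" \
        --disk-only --atomic --no-metadata
}

# Double quotes keep the apostrophe in the file name literal.
grow_chain vm1 vda "/export/vmimages/A'.qcow2"
```

No snapshot name or description is passed, since without metadata they
would not be recorded anywhere.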

> > 
> > (2) Background copy of A to B:
> > 
> >     $ virsh blockcopy \
> >         --domain vm1 vda /export/vmimages/B.qcow2 \
> >         --wait --verbose --shallow \
> >         --finish
> 
> This step is not quite right.  You are asking for a shallow copy of the
> current file for disk 'vda' (that is, A'.qcow2).  But that is NOT the
> same as the base A image.

Oh right, thanks for catching this mistake.

> For this step, libvirt does not yet have an easy way to access the
> contents of a backing chain of a live domain; you CAN use 'virsh
> vol-*' commands to do a background copy from storage pools, but it may
> be easier to just resort to normal file system tools:
> 
> cp /export/vmimages/A.qcow2 /export/vmimages/B.qcow2

Yeah, simple, and fewer commands to type too.

> or even rely on storage-array-specific commands to set up a trivial
> clone with no real time overhead (for example, some iscsi storage arrays
> allow efficient copy-on-write cloning of storage volumes by creating a
> new name that shares the same original contents of A.qcow2 as its
> starting point; and since we are about to delete A.qcow2 later on, we
> never need any actual data copying).
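
A minimal sketch of the file-system variant of this copy, assuming GNU
coreutils 'cp'. The 'clone_base' helper is a hypothetical name;
'--reflink=auto' asks for a copy-on-write clone where the file system
supports one and falls back to an ordinary copy otherwise. This is only
the file-system analogue of the storage-array cloning described above,
not a replacement for it:

```shell
#!/bin/sh
# Hypothetical helper: copy the base image SRC to DST, sharing blocks
# with the source when the file system can do so.
clone_base() {
    src=$1 dst=$2
    # Refuse to overwrite an existing destination image.
    if [ -e "$dst" ]; then
        echo "refusing to overwrite $dst" >&2
        return 1
    fi
    cp --reflink=auto "$src" "$dst"
}

# Example, with the paths used in this thread:
#   clone_base /export/vmimages/A.qcow2 /export/vmimages/B.qcow2
```

Copying A.qcow2 while the guest runs is only safe here because, after
step (1), guest writes land in the overlay rather than in A.qcow2.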
> 
> > 
> > (3) Create an empty B' with backing file B:
> > 
> >     $ qemu-img create -f qcow2 -b B.qcow2 \
> >         -o backing_fmt=qcow2 "B'.qcow2"
> > 
> >     [or]
> > 
> >     $ virsh vol-create-as default "B'.qcow2" 1G \
> >         --format qcow2 \
> >         --backing-vol B.qcow2 --backing-vol-format qcow2
> 
> [side note - we should really teach libvirt to not REQUIRE a size when
> creating an empty wrapper around an existing image]

Filed: https://bugzilla.redhat.com/show_bug.cgi?id=1150411

> > 
> > (4) Do a shallow blockcopy of A' to B':
> > 
> >     $ virsh blockcopy \
> >         --domain vm1 vda "/export/vmimages/B'.qcow2" \
> >         --wait --verbose --shallow \
> >         --finish
> 
> For this to work, you need to also use the --reuse-external flag

True, I self-corrected in my other response in this thread, but thanks
for noticing.

> to take
> advantage of the backing chain already recorded in B'.qcow2 (without the
> flag, the command will complain that B'.qcow2 already exists if it is a
> regular file; if it is a block device, it will just silently ignore the
> contents of the block device and treat B'.qcow2 as though an absolute
> path to A.qcow2 were its backing file).
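
A sketch of step (4) with the flag added, keeping the other options as
posted. The 'run' and 'shallow_copy_top' helpers are hypothetical names,
and the existence check is my own addition, since '--reuse-external'
only makes sense when the destination was created beforehand (step 3).
It prints the command unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Dry-run by default: print the command instead of executing it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: shallow-copy the active layer of DISK into a
# pre-created overlay DEST, reusing the backing chain recorded in DEST.
shallow_copy_top() {
    domain=$1 disk=$2 dest=$3
    if [ ! -e "$dest" ]; then
        echo "missing pre-created overlay: $dest" >&2
        return 1
    fi
    # --finish ends the job with the domain still on the source image;
    # --pivot would switch the domain over to DEST instead.
    run virsh blockcopy --domain "$domain" "$disk" "$dest" \
        --wait --verbose --shallow --reuse-external --finish
}

# Example, with the names used in this thread:
#   shallow_copy_top vm1 vda "/export/vmimages/B'.qcow2"
```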
> 
> > 
> > (5) Then live shallow commit of B:
> > 
> >     $ virsh blockcommit \
> >         --domain vm1 vda \
> >         --wait --verbose --shallow \
> >         --pivot --active --finish
> >     Block Commit: [100 %]
> >     Successfully pivoted
> 
> With steps 2 and 4 corrected, this indeed shortens the chain back down
> to just B.qcow2.  And once this happens, you no longer need the path to
> A.qcow2 or A'.qcow2; you can also delete B'.qcow2.  But back to the
> point I made earlier at [1]: if this is all you do, then 'virsh
> snapshot-list' will still show 'snap1' as a snapshot that tries to refer
> to A'.qcow2; since you just invalidated that with the copy, you'd need
> to 'virsh snapshot-delete --metadata vm1 snap1' to get rid of the stale
> snapshot (if you don't tweak step 1 to avoid creating that snapshot
> metadata in the first place).

Thanks for the reminder; I'll script this as part of my tests so that
it isn't missed.
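
Roughly what that cleanup could look like, as a sketch only. The 'run'
and 'drop_snapshot_metadata' helpers are hypothetical names; the virsh
invocation itself is the one quoted above. It prints the commands
unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Dry-run by default: print the command instead of executing it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: remove libvirt's metadata for each named
# snapshot of DOMAIN. --metadata leaves the image files untouched.
drop_snapshot_metadata() {
    domain=$1
    shift
    for snap in "$@"; do
        run virsh snapshot-delete --metadata "$domain" "$snap"
    done
}

drop_snapshot_metadata vm1 snap1
```

This is only needed when step (1) was run without '--no-metadata'.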

> The NICE part about this whole sequence is that the backing file does
> NOT have to be qcow2, and it is VERY efficient timewise, if you happen
> to have an efficient way to do step 2.  That is, I can go from a
> multi-gigabyte raw file A.img to raw file B.img in less than a second,
> assuming the guest isn't doing much I/O in the meantime, when scripting
> all these steps together, and without any guest downtime.

Thanks again for your meticulous review.

-- 
/kashyap