[libvirt-users] virsh blockcopy: doesn't seem to flatten the chain by default
Kashyap Chamarthy
kchamart at redhat.com
Fri Jul 4 17:12:19 UTC 2014
On Thu, Jul 03, 2014 at 11:08:15AM -0600, Eric Blake wrote:
> On 07/02/2014 01:12 PM, Kashyap Chamarthy wrote:
>
> > We have this simple chain:
> >
> > base <- snap1
> >
> > Let's quickly examine the contents of 'base' and 'snap1' images:
> >
>
> > Now, let's do a live blockcopy (with a '--finish' to gracefully finish
> > the mirroring):
> >
> > $ virsh blockcopy --domain testvm2 vda \
> > /export/dst/copy.qcow2 \
> > --wait --verbose --finish
>
> This defaults to a full copy (copy.qcow2 will contain everything in the
> latest state of the original chain, but with no backing file).
>
>
> >
> > If I'm reading the man page of 'blockcopy' correctly, shouldn't it
> > 'flatten' the entire chain, by also copying the contents of base into
> > copy.qcow2? i.e. the 'copy' should have files (including the file foo
> > from 'base'):
> >
> > foo, bar, baz, jazz
> >
> >
> > True or false?
>
> False. This is NOT a union mount. Sometime in between base and snap1,
> you deleted foo.
Hmm, I do realize that if I deleted 'foo' in between the above two
points you mentioned, it _does_ reflect in snap1.

I realized it's me who made a silly mistake, as I quickly did another
test which validates your (very eloquent) details further below.

For completeness' sake, a correct test below -- it's the simplest case of
blockcopy, with a chain depth of 1.
1. Create base image:
$ qemu-img create -f qcow2 base.qcow2 1G
1.1. Create a file system on the disk & add file 'foo':
---------------------------------
$ guestfish --rw -a base.qcow2
[. . .]
><fs> run
><fs> part-disk /dev/sda mbr
><fs> mkfs ext4 /dev/sda1
><fs> list-filesystems
><fs> mount /dev/sda1 /
><fs> touch /foo
><fs> ls /
foo
lost+found
><fs> exit
--------------------------------
2. Create a snapshot, 'snap1' with backing file as 'base':
$ qemu-img create -f qcow2 -b base.qcow2 \
-o backing_fmt=qcow2 snap1.qcow2
2.1. Examine contents of 'snap1', add a couple more files: bar, baz,
jazz:
--------------------------------
$ guestfish --rw -a snap1.qcow2
[. . .]
><fs> run
><fs> mount /dev/sda1 /
><fs> ls /
foo
lost+found
><fs> touch /bar
><fs> touch /baz
><fs> touch /jazz
><fs> ls /
bar
baz
foo
jazz
lost+found
--------------------------------
3. Provide SELinux context:
$ chcon -t svirt_image_t base.qcow2 snap1.qcow2
4. Create a persistent XML file:
----------
$ cat <<EOF > /etc/libvirt/qemu/testvm.xml
<domain type='kvm'>
<name>testvm</name>
<memory unit='MiB'>512</memory>
<vcpu>1</vcpu>
<os>
<type arch='x86_64'>hvm</type>
</os>
<devices>
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2'/>
<source file='/export/src/snap1.qcow2'/>
<target dev='vda' bus='virtio'/>
</disk>
</devices>
</domain>
EOF
----------
5. Perform blockcopy:
$ virsh blockcopy --domain testvm vda \
/export/dst/copy.qcow2 \
--wait --verbose --finish
Block Copy: [100 %]
Successfully copied
6. Examine contents of copy.qcow2:
--------------------------------
$ guestfish --ro -a /export/dst/copy.qcow2
><fs> run
><fs> mount /dev/sda1 /
><fs> ls /
bar
baz
foo
jazz
lost+found
><fs> quit
--------------------------------
6.1. Enumerate the backing chain of copy.qcow2; it should be a
standalone image:
--------------------------------
$ qemu-img info --backing-chain /export/dst/copy.qcow2
image: /export/dst/copy.qcow2
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 18M
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: false
--------------------------------
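The same check can be scripted. Below is a minimal sketch in Python. It assumes that `qemu-img info --output=json --backing-chain` prints a JSON array with one object per image in the chain, top-most image first. It also assumes that an image with a backing file carries a `backing-filename` key. The two JSON samples are hand-written for illustration, not captured output, and `is_standalone` is a made-up helper name.

```python
import json


def is_standalone(info_json: str) -> bool:
    """True if the chain holds exactly one image and it names no backing file."""
    chain = json.loads(info_json)
    return len(chain) == 1 and "backing-filename" not in chain[0]


# Hand-written samples (illustrative only, not real qemu-img output).
flat = '[{"filename": "/export/dst/copy.qcow2", "format": "qcow2"}]'
chained = """[
  {"filename": "snap1.qcow2", "format": "qcow2",
   "backing-filename": "base.qcow2"},
  {"filename": "base.qcow2", "format": "qcow2"}
]"""

print(is_standalone(flat))     # True
print(is_standalone(chained))  # False
```

On a real host the string would come from running the qemu-img command above and capturing its standard output.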
> That is recorded in snap1, and when reading a chain,
> you stop at the first level of the chain that provides information.
> When flattening, it means you are inherently losing any information
> about the state that existed before snap1 changed the state, at least
> when using the flattened chain to try and find that information.
>
> Graphically (well, using ASCII), let's look at it like this. When you
> start your guest originally, you have a big blank disk being tracked by
> the base image, and write into some sectors of that disk. So, use "A"
> to represent the initial OS install, and "X" to represent a sector not
> yet written:
>
> base: AAAAXXXXXXXXXXXX
> ====
> guest: AAAAXXXXXXXXXXXX
>
> Then you modify the guest to write the file foo; represent that with
> "B" for the sectors that were modified:
>
> base: AAABBBBBXXXXXXXX
> ====
> guest: AAABBBBBXXXXXXXX
>
> Then you take a snapshot. At the point you take it, snap1 is completely
> empty, but notice that the guest view of the world is still unchanged:
>
> base: AAABBBBBXXXXXXXX
> snap1: XXXXXXXXXXXXXXXX
> ====
> guest: AAABBBBBXXXXXXXX
>
> Now you make some more modifications, such as deleting foo and creating
> bar (note that deleting a file can be done by writing one sector, and
> may still leave remnants of the file behind in other sectors, but in
> such a way that the guest file system will never retrieve those
> contents). Represent these changes with "C":
>
> base: AAABBBBBXXXXXXXX
> snap1: XXXCXXXXCCCCXXXX
> ====
> guest: AAACBBBBCCCCXXXX
>
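The "stop at the first level of the chain that provides information" rule can be sketched in a few lines of Python. This is a toy model of the diagrams, not how qemu is implemented. Each layer is a string with one character per sector, "X" means the layer holds no data for that sector, and `guest_view` is a name made up for this illustration.

```python
UNALLOCATED = "X"  # marker for a sector the layer has never written


def guest_view(chain):
    """Resolve what the guest reads; `chain` is ordered top-most layer first."""
    view = []
    for sector in range(len(chain[0])):
        # Walk down the chain and stop at the first layer holding data.
        for layer in chain:
            if layer[sector] != UNALLOCATED:
                view.append(layer[sector])
                break
        else:
            # No layer has written this sector.
            view.append(UNALLOCATED)
    return "".join(view)


base = "AAABBBBBXXXXXXXX"
snap1 = "XXXCXXXXCCCCXXXX"

print(guest_view([snap1, base]))  # AAACBBBBCCCCXXXX
```

The "C" that snap1 holds in the fourth sector hides the "B" beneath it in base. That is why a deletion recorded in snap1 wins over whatever base still contains.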
> When you are doing a full blockcopy, you are asking to create a new file
> whose contents match what the guest sees. When the copy finally reaches
> sync, you have:
>
> base: AAABBBBBXXXXXXXX
> snap1: XXXCXXXXCCCCXXXX
> copy: AAACBBBBCCCCXXXX
> ====
> guest: AAACBBBBCCCCXXXX
>
> The copy operation lasts as long as you want; in that time, the guest
> can make even more changes. Let's call them "D":
>
> base: AAABBBBBXXXXXXXX
> snap1: XXXDXXXXCCCCDDDD
> copy: AAADBBBBCCCCDDDD
> ====
> guest: AAADBBBBCCCCDDDD
>
> Then you finally abort or pivot the copy. Let's try a pivot, where the
> next action in the guest causes changes to the disk labeled "E":
>
> base: AAABBBBBXXXXXXXX
> snap1: XXXDXXXXCCCCDDDD
>
> copy: AAAEBBBBCCCCDDDD
> ====
> guest: AAAEBBBBCCCCDDDD
>
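The whole sequence (full copy, writes during the job, then pivot) can be replayed with the same toy model in Python. This is a sketch under assumptions: layers are strings with one character per sector, "X" means unallocated, and the helpers `guest_view` and `write` are invented for this illustration. It models only the end state shown in the diagrams, not qemu's actual mirroring mechanics.

```python
UNALLOCATED = "X"


def guest_view(chain):
    """Per sector, take the first layer (top-most first) that holds data."""
    return "".join(
        next((layer[i] for layer in chain if layer[i] != UNALLOCATED),
             UNALLOCATED)
        for i in range(len(chain[0]))
    )


def write(layer, start, data):
    """Return a new layer with `data` written at sector `start`."""
    return layer[:start] + data + layer[start + len(data):]


base = "AAABBBBBXXXXXXXX"
snap1 = "XXXCXXXXCCCCXXXX"

# Full copy: the destination starts out as the flattened guest view.
copy = guest_view([snap1, base])

# While the job is active, each guest write ("D") lands in both the
# active layer and the copy.
for start, data in [(3, "D"), (12, "DDDD")]:
    snap1 = write(snap1, start, data)
    copy = write(copy, start, data)

# After a pivot the guest runs from the copy alone; "E" never reaches snap1.
copy = write(copy, 3, "E")

print(snap1)  # XXXDXXXXCCCCDDDD
print(copy)   # AAAEBBBBCCCCDDDD
```

After the pivot, base and snap1 keep the state they had at pivot time, which is what the last diagram shows.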
> >
> >
> > PS: I've tested the cases of --pivot, --shallow and --reuse-external,
> > will post my notes about them on a wiki.
>
> I hope those help you figure out what's going on.
They do, thanks for taking the time to write these abundantly clear details.
As my newer test proved, it was a PEBKAC. I really liked the way you
denote the 'guest' view and the disk/snapshot views.
> You seem to be hoping
> for a magic bullet that gives you file system union mounts (merge the
> contents of two different timestamps of a directory's existence in a
> common file system) - but that is NOT what disk snapshots do.
In reality I wasn't expecting union mounts at all :-) I didn't think of
it actively until you explicitly mentioned the topic in so much
detail.
> All
> libvirt and qemu can do is block level manipulations, not file system
> manipulations. I'm not even sure if a file system tool exists that can
> do file system checkpoints and/or union mount merges; but if it does, it
> would be something you use in the guest at the file system level, and
> not something libvirt can manage at the block device sector level.
Understood. Thanks again, for all these details, Eric.
--
/kashyap