RFC: revert to external snapshot API

Mon Sep 13 14:52:49 UTC 2021

ср, 8 сент. 2021 г. в 12:27, Peter Krempa <pkrempa at redhat.com>:

> On Tue, Aug 31, 2021 at 16:46:25 +0300, Nikolay Shirokovskiy wrote:
> > Hi, all.
>
> Hi, sorry for the late reply I was on PTO.
>
>
> > I want to implement reverting to external snapshot functionality.
> > I've checked the mailing list history and found some previous attempts
> > (not sure if this is a complete list of them).
> >
> > [1] was done in 2012 by the Redhat team itself. [2] and [3] was done in
> > 2018.
> > Looks like it was not clear how the API should look like.
>
> One additional thing is that me and phrdina started discussing this (in
> person so I can't point you to a discussion) 2 weeks ago.
>
> I'll summarize the points we agreed upon.
>
> > For example we have disk.qcow2 <- disk.snap1 chain <- disk.snap2 chain
> > (after
> > having snap1 and snap2 snapshots). Now we want to revert to snap1
> snapshot.
>
> There's one implementation snag we currently have which complicates
> stuff. Let's expand your above scenario with the snapshot states:
>
>
>                            s                      s
>                            n                      n
>                            a                      a
>                            p                      p
>                            1                      2
>                                                                           p
> o                          │                      │                       r
> r    base.qcow2            │    snap1.qcow2       │    snap2.qcow2        e
> i ───────────────────────► │ ────────────────────►│ ────────────────────► s
> g                          │                      │                       e
> i                          │                      │                       n
> n                                                                         t
>
> A rather big set of problems which we will necessarily encounter when
> implementing this comes from the following:
>
> 1) users can modify the VM betwen snapshots or between the last snapshot
> and the abandoned state and this modification (e.g. changing a disk
> image) must be considered to prevent data loss when manipulating the
> images.
>

Hi, Peter!

Can you please explain in more detail what the issue is? For example if
after snap1 I changed disk image to some base2.qcow2 then snap1.qcow2 is
expected to be managed by mgmt (expected to be deleted I guess). Then I made
a snap2. Now I have base.qcow2 (keeping snap1 state) and  base2.qcow2
(keeping snap2 state) <-snap2.qcow2
chain. Thanks to metadata I can revert to both snap1 and snap2 as I know
the domain
xml in those states and these xmls references right images (snap1
references base.qcow2
and snap2 references base2.qcow2).

> 2) libvirt doesn't record the XML used to create the snapshot (e.g.
>

It is not quite so. According to
https://libvirt.org/formatsnapshot.html#example
xml given to create API saved in snapshot metadata. Or I miss something.

> snap1.xm) thus we don't actually know what the (historical) snapshot was
> actually recording. Coupled with the fact that the user could have
> changed the VM definition between 'snap1' and 'snap2' we can't even
> infer what the state was.
>

Here too I cannot follow what the issue is. Can you please give an example?

>
>
> > The
> > snapshot state is held by disk.qcow2 image. We can run reverted domain on
> > disk.qcow2 itself but then we lose snap1 (held by disk.qcow2) and snap2
> > (held by disk.snap1). So we definitely should run on some overlay over
> > disk.qcow2. But then what name should be given to overlay? We should have
> > an option for mgmt to specify this name like in case of snapshots itself.
>
> Exactly. Reversion of external snapshots will necessarily require a new
> API, which will take a new "snapshot" XML describing the new overlays as
> you describe below.
>
> In the simple case such as with local files we can use the same
> algorithm for creating overlay filenames as we do when creating
> snapshots but generally we need to give the MGMT the ability to specify
> the overlay name.
>
>
> > The [1] has some discussion on adding extra options to reverting
> > functionality.
> > Like for example if we revert back to snap2 then having the ability to
> run
> > from
> > state in disk.snap2 rather than disk.snap1. My opinion is there is no
> need
> > to
> > as if one wants to revert to the state of disk2.snap2 it can take a
> > snapshot (some
> > snap3).
>
> It's possible to avoid doing a combined "take snapshot&revert" operation
> as long as we have the possibility to take a snapshot and destroy the
> VM as running it after the point you want to return to is undesirable
> and pointless.
>
> One thing that you need to consider here is that when you are reverting
> to an arbitrary snapshot, the overlay files after the last snapshot
> (e.g. snap2.qcow2 (in my diagram), which is the state between snap2 and
> present) are abandoned and will never be used again.
>
> At this point we disucssed that we should be removing those as
> semantically it's the only point where we can do that as we know the
> state of the VM which will be abandoned.
>
> The problem lies in the fact that between 'snap2' and present the user
> could have exchanged disks and then we don't have enough metadata to
> create the overlays.
>
> One big additional caveat is that if the user exchanged disk images we
> _must not_ delete the changed image, so the metadata wrangling might be
> non-trivial in this case.
>
> > At the same time one needs to be aware that revert operation loses
> > current state and later one can revert only to the states of snapshots.
> > This is the way internal snapshots work and the way one expects external
> > snapshots to work too.
>
> That is an acceptable caveat for the user. As noted above as long as you
> can take a snapshot & atomicaly destroy the VM it's acceptable.
>
> For libvirt it's harder a bit as described above especially if we don't
> want to keep litering the disk with unused and invalid images.
>
> > The [2] takes an approach of reusing current top image as overlay on
> revert
> > so
> > that in the above example disk.snap2 will be overlay over disk.qcow2 on
> > reverting to snap1 snapshot. IMHO this is a confusing naming scheme.
>
> No in our discussion with Pavel  we've ruled out all special cases as
> too much hassle to implement with little benefit.
>
> If a user wants to rever&abandon some state, they can do 2 operations of
> reverting and then deleting the unwanted snapshots.
>
> > > The [3] suggests a different scheme for naming images. For example
> after
> > taking
> > snapshot snap1 the chain should be disk.snap1 <- disk.qcow2 which looks
> very
> > appealing to me. With this naming using the [2] approach is quite
> natural.
>
> IMO we can't really change the naming any more, users baked scripts on
> top of this. Using a flag is also mostly pointless as users will keep
> creating snapshots without it. Thus I don't think this is a viable
> approach. The only advantage it would have that it would be simpler to
> vizualize what happened, but ideally users would not need to care about
> the steps and filenames at all.
>
> > Implementing this does not look hard even for a running domain but this
> is
> > a big change to API and all mgmt need to be aware of (well it could be
> done
> > safely using a new flag).
>
> Yup, too much hassle, questionable outcome. Not worth it.
>
> >
> > Anyway we can go on with current image names. In order to specify overlay
> > image name let's introduce new API:
> >
> >     int virDomainRevertToSnapshotXML(virDomainSnapshotPtr snapshot,
> >                                      char *xmlDesc,
> >                                      unsigned int flag
>
> I was briefly considering that the new snapshot api should also have
> provisions for specifying a snapshot XML to take the snapshot of the
> abandoned state, but as discussed it's not really necessary if you can
> do it in 2 steps.
>
> > with XML like:
> >
> >     <domainsnapshotrevert>
> >       <disks>
> >         <disk name='vda'>
> >           <source file='/path/to/revert/overlay'/>
> >         </disk>
> >       </disks>
> >     </domainsnapshotrevert>
> >
> > Having an XML looks like a bit overkill right now but I could not
> > find a better solution.
>
> I don't see any other option. In case you have e.g. networked storage
> where we can't figure out stuff by ourselves you basically don't have
> other option.
>
> > If overlay name is omitted the generated name will be like disk.snap1-1
> in
> > the
> > above example to be in alignment with creating a snapshot case. So that
> > disk.snap1*
> > images hold states derived from snap1 state. We can also support
> reverting
> > to
> > external snapshot thru existing virDomainRevertToSnapshot for those who
> > rely on
> > generated names using this naming scheme.
>
> Yes, this would be possible, but quite fragile. But with regular
> snapshots without any magic it should work.
>
>
> One thing you've missed though is that deletion of snapshots now becomes
> quite a challenge.
>

Yeah I did not consider implementing deleting at that moment. However to
make
external snapshots usable it should be implemented as well.

Anyway is anybody in Red Hat working on this (considering you and Pavel
discussed
the topic recently)? If not I would like to proceed with implementation.

Nikolay

>
>
>                      s              s              s
>                      n              n              n
>                      a              a              a
>                      p              p              p
>                      1              2              3
>
> o                    │              │              │
> r    base.qcow2      │  snap1.qcow2 │  snap2.qcow2 │
> i ─────────────────► │ ────────────►│ ────────────►│
> g                    │              │              │
> i                    │              │              │
> n                    │
>                      │                                    p
>                      │                                    r
>                      │  alternatehistory.qcow2            e
>                      │ ────────────────────────────────►  s
>                      │                                    e
>                      │                                    n
>                                                           t
>
> Specifically the premise is that we don't want to keep unnecessary
> images (e.g. the trivial solution for deleting 'snap1' would be to
> remove the metadata and keep base.qcow2 intact) as without the metadata
> it will become very hard to determine which images are actually still
> used.
>
> So in the above scenario:
>
> - User wants to delete snap3:
>   'snap2.qcow2' overlay becomes unreachable and should be deleted.
>
> - User wants to delete snap2:
>   There are 2 options in this case:
>     1) block-pull snap1 into snap2
>     2) block-commit snap2 into snap1
>
>   The optimal solution is the one that transfers less data, but in the
>   end either of them is equally good in terms of saved storage.
>
> - User wants to delete snap1:
>   This is a bit harder:
>    1) commit into base.qcow2 would invalidate 'alternatehistory.qcow2'
>    2) pull of base into both snap1.qcow2 and alternatehistory.qcow2 is
>       possible, but may yield big images
>    3) the last option is to forbid deletion of snapshots that has
>       alternate histories
>
> The first thing we should do though is to forbid creation of mixed trees
> of internal and external snapshots because the matrix of stuff to handle
> would explode way beyond this already rather complex situation.
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20210913/a426352a/attachment-0001.htm>