[libvirt] [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore

Dr. David Alan Gilbert dgilbert at redhat.com
Thu Oct 11 12:06:14 UTC 2018


* Peter Krempa (pkrempa at redhat.com) wrote:
> On Tue, Oct 09, 2018 at 19:34:43 +0200, Markus Armbruster wrote:
> > Cc: libvir-list for review of the compatibility argument below.
> > 
> > Daniel Henrique Barboza <danielhb413 at gmail.com> writes:
> > 
> > > Hey David,
> > >
> > > On 9/21/18 9:29 AM, Dr. David Alan Gilbert wrote:
> > >> * Daniel Henrique Barboza (danielhb413 at gmail.com) wrote:
> > >>> changes in v2:
> > >>> - removed the "RFC" marker;
> > >>> - added a new patch (patch 2) that removes
> > >>> bdrv_snapshot_delete_by_id_or_name from the code;
> > >>> - made changes in patch 1 as suggested by Murilo;
> > >>> - previous patch set link:
> > >>> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg04658.html
> > >>>
> > >>>
> > >>> It is not uncommon to see bugs being opened by testers that attempt to
> > >>> create VM snapshots using HMP. It turns out that "0" and "1" are quite
> > >>> common snapshot names and they trigger a lot of bugs. I gave an example
> > >>> in the commit message of patch 1, but to sum up here: QEMU treats the
> > >>> input of savevm/loadvm/delvm sometimes as 'ID', sometimes as 'name'. It
> > >>> is documented as such, but this can lead to strange situations.
> > >>>
> > >>> Given that it is strange for an API to consider a parameter to be 2 fields
> > >>> at the same time, and inadvently treating them as one or the other, and
> > >>> that removing the ID field is too drastic, my idea here is to keep the
> > >>> ID field for internal control, but do not let the user set it.
> > >>>
> > >>> I guess there's room for discussion about considering this change an API
> > >>> change or not. It doesn't affect users of HMP and it doesn't affect Libvirt,
> > >>> but simplifying the meaning of the parameters of savevm/loadvm/delvm.
> > >> Can you clarify a couple of things:
> > >>    a) What is it about libvirt's use that means it's OK ? Does it never
> > >> use numeric tags?
> > >
> > > I am glad you asked because my understanding in how Libvirt was dealing
> > > with snapshots was wrong, and I just looked into it further to answer your
> > > question. Luckily, this series fixes the situation for Libvirt as well.
> > >
> > > I was misled by the fact that Libvirt does not show the same symptoms
> > > we see in
> > > QEMU of this problem, but the bug is there. Here's a quick test with
> > > Libvirt with
> > > "0" and "1" as snapshot names, considering a VM named with no snapshots,
> > > using QEMU 2.12 and Libvirt 4.0.0:
> > >
> > > - create the "0" snapshot:
> > >
> > > $ sudo virsh snapshot-create-as --name 0 dhb
> > > Domain snapshot 0 created
> > >
> > > $ sudo virsh snapshot-list dhb
> > > Name Creation Time State
> > > ------------------------------------------------------------
> > > 0 2018-09-24 15:47:56 -0400 running
> > >
> > > $ sudo virsh qemu-monitor-command dhb --hmp info snapshots
> > > List of snapshots present on all disks:
> > > ID TAG VM SIZE DATE VM CLOCK
> > > --    0  405M 2018-09-24 15:47:56 00:04:20.867
> > >
> > >
> > > - created the "1" snapshot. Here, Libvirt shows both snapshots with
> > > snapshot-list,
> > > but the snapshot was erased inside QEMU:
> > >
> > > $ sudo virsh snapshot-create-as --name 1 dhb
> > > Domain snapshot 1 created
> > > $
> > > $ sudo virsh snapshot-list dhb
> > > Name Creation Time State
> > > ------------------------------------------------------------
> > > 0 2018-09-24 15:47:56 -0400 running
> > > 1 2018-09-24 15:50:09 -0400 running
> > >
> > > $ sudo virsh qemu-monitor-command dhb --hmp info snapshots
> > > List of snapshots present on all disks:
> > > ID TAG VM SIZE DATE VM CLOCK
> > > --    1  404M 2018-09-24 15:50:10 00:05:36.226
> > >
> > >
> > > This is where I stopped checking out Libvirt at first, concluding
> > > wrongly that it
> > > was immune to the bug.
> > >
> > > Libvirt does not throw an error when trying to apply snapshot 0. In
> > > fact, it acts
> > > like everything went fine:
> > >
> > > $ sudo virsh snapshot-revert dhb 0
> > >
> > > $ echo $?
> > > 0
> > 
> > Is that a libvirt bug?
> 
> Probably yes. The error handling from HMP sucks and can't really be
> fixed in all cases. This is for the handler which calls "loadvm":
> 
>     if (strstr(reply, "No block device supports snapshots") != NULL) {
>         virReportError(VIR_ERR_OPERATION_INVALID, "%s",
>                        _("this domain does not have a device to load snapshots"));
>         goto cleanup;
>     } else if (strstr(reply, "Could not find snapshot") != NULL) {
>         virReportError(VIR_ERR_OPERATION_INVALID,
>                        _("the snapshot '%s' does not exist, and was not loaded"),
>                        name);
>         goto cleanup;
>     } else if (strstr(reply, "Snapshots not supported on device") != NULL) {
>         virReportError(VIR_ERR_OPERATION_INVALID, "%s", reply);
>         goto cleanup;
>     } else if (strstr(reply, "Could not open VM state file") != NULL) {
>         virReportError(VIR_ERR_OPERATION_FAILED, "%s", reply);
>         goto cleanup;
>     } else if (strstr(reply, "Error") != NULL
>              && strstr(reply, "while loading VM state") != NULL) {
>         virReportError(VIR_ERR_OPERATION_FAILED, "%s", reply);
>         goto cleanup;
>     } else if (strstr(reply, "Error") != NULL
>              && strstr(reply, "while activating snapshot on") != NULL) {
>         virReportError(VIR_ERR_OPERATION_FAILED, "%s", reply);
>         goto cleanup;
>     }
> 
> If the above does not match the reported error, we report success. The
> same problem can happen with any of the 5 [1] HMP command handlers we
> still have.
> 
> Note that a similar abomination is also for the code which calls
> "savevm".

Would the following small qemu change make life a little safer:

diff --git a/hmp.c b/hmp.c
index 576253a01f..0729a8c7ed 100644
--- a/hmp.c
+++ b/hmp.c
@@ -62,7 +62,7 @@ static void hmp_handle_error(Monitor *mon, Error **errp)
 {
     assert(errp);
     if (*errp) {
-        error_report_err(*errp);
+        error_reportf_err(*errp, "Error: ");
     }
 }

changing:

No block device supports snapshots


to:

Error: No block device supports snapshots

so you could check for Error: at the start and know that you've hit some
unknown error?

Dave


> [1] We technically have 6 HMP handlers, but "cpu_set" is not used if yoy
> have a QEMU younger than 3 years. Soon also "drive_add" and "drive_del"
> will be replaced, so we'll be stuck with the 3 internal snapshot ones.
> > 
> > > Reverting back to snapshot "1" works as intended, restoring the VM
> > > state when it
> > > was created.
> > >
> > >
> > > (perhaps this is something we should let Libvirt be aware of ...)
> 
> The error message from qemu which was wrongly ignored by qemu can be
> found in the libvirtd debug log. It would be helpful if you could
> provide it.
> 
> > >
> > >
> > >
> > > This series fixes this behavior because the snapshot 0 isn't erased
> > > from QEMU, allowing
> > > Libvirt to successfully restore it.
> > >
> > >
> > >>    b) After this series are you always guaranteed to be able to fix
> > >> any existing oddly named snapshots?
> > >
> > > The oddly named snapshots that already exists can be affected by the
> > > semantic
> > > change in loadvm and delvm, in a way that the user can't load/del
> > > using the snap
> > > ID, just the tag. Aside from that, I don't see any side effects with
> > > existing
> > > snapshots and this patch series.
> > 
> > Do all snapshots have a tag that is unique within their image?  Even
> > snapshots created by old versions of QEMU?
> 
> I remember there was a discussion which regarded problems with
> collisions between the name and the ID of the snapshot when dealing with
> loadvm/delvm commands but I can't seem to find it currently. Note that
> libvirt plainly issues loadvm/delvm $SNAPSHOTNAME. If the name is
> numeric it might clash. Similarly for inactive vms via qemu-img.


--
Dr. David Alan Gilbert / dgilbert at redhat.com / Manchester, UK




More information about the libvir-list mailing list