[libvirt] [Qemu-devel] [PATCH RFC 0/4] Allow hibernation on guests

Luiz Capitulino lcapitulino at redhat.com
Thu Jan 26 20:13:55 UTC 2012


On Thu, 26 Jan 2012 20:41:13 +0100
Michal Privoznik <mprivozn at redhat.com> wrote:

> On 26.01.2012 20:35, Luiz Capitulino wrote:
> > On Thu, 26 Jan 2012 08:18:03 -0700
> > Eric Blake <eblake at redhat.com> wrote:
> > 
> >> [adding qemu-devel]
> >>
> >> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
> >>>> One thing, that you'll probably notice is this
> >>>> 'set-support-level' command. Basically, it tells GA what qemu version
> >>>> it is running on. Ideally, this should be done as soon as
> >>>> GA starts up. However, that cannot be determined from outside
> >>>> world as GA doesn't emit any events yet.
> >>>> Ideally^2 this command should be left out as it should be qemu
> >>>> who tells its own agent this kind of information.
> >>>> Anyway, I was going to call this command in qemuProcess{Startup,
> >>>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs
> >>>> so guest can boot and start GA, but that implies returning from qemuProcess*.
> >>>>
> >>>> So I am setting this just before the 'guest-suspend' command, as
> >>>> there is one more thing about GA. It is unable to remember anything
> >>>> upon its restart (GA process). Which has BTW shown a flaw
> >>>> in our current code with FS freeze & thaw. If we freeze the guest
> >>>> FS, and somebody restarts GA, a simple FS thaw will not succeed as
> >>>> GA thinks the FS are not frozen. But that's a different cup of tea.
> >>>>
> >>>> Because of what is written above, we need to call set-level
> >>>> on every suspend.
> >>>
> >>>
> >>> IMHO all this says that the 'set-level' command is a conceptually
> >>> unfixably broken design & should be killed in QEMU before it turns
> >>> into an even bigger mess.
> > 
> > Can you elaborate on this? Michal and I talked on irc about making the
> > compatibility level persistent, would that help?
> > 
> >>> Once we're in a situation where we need to call 'set-level' prior
> >>> to every single invocation, you might as well just allow the QEMU
> >>> version number to be passed in directly as an arg to the command
> >>> you are running directly thus avoiding this horrificness.
> >>
> >> Qemu folks, would you care to chime in on this?
> >>
> >> Exactly how is the set-level command supposed to work?  As I understand
> >> it, the goal is that if the guest has qemu-ga 1.1 installed, but is
> >> being run by qemu 1.0, then we want to ensure that any guest agent
> >> command supported by qemu-ga 1.1 but requiring features of qemu not
> >> present in qemu 1.0 will be properly rejected.
> > 
> > Not exactly, the default support of qemu-ga is qemu 1.0. This means that by
> > default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This
> > way the set-support-level command allows you to specify that qemu 2.0 features
> > are supported.
> > 
> > Note that this is only about specific features that depend on host support,
> > like S3 suspend which is known to be buggy in current and old qemu.
> > 
> >> But whose job is it to tell the guest agent what version of qemu is
> >> running?  Based on the above conversation, it looks like the current
> >> qemu implementation does not do any handshaking on its own when the
> >> guest agent first comes alive, which means that you are forcing the work
> >> on the management app (libvirt).  And this is inherently racy - if the
> >> guest is allowed to restart its qemu-ga process at will, and each
> >> restart of that guest process triggers a need to redo the handshake,
> >> then libvirt can never reliably know what version the agent is running at.
> > 
> > Making the set-support-level persistent would solve it, wouldn't it?
> 
> Yes and no. We still need an event when GA comes to life. Because if
> anybody tries to write something for GA which is not running (and for
> the purpose of this scenario assume it never will), like 'set-support-level'
> and waits for an answer (which will never come), he will be blocked
> indefinitely. However, if he writes it after the first event comes, everything
> is OK.

What if the event never reaches libvirt?

This problem is a lot more general and is not related to the
set-support-level command. Maybe adding shutdown & start events can serve as
good hints, but they won't fix the problem.

IMHO, the best way to solve this is to issue the guest-sync command with
a timeout. If you get no answer, then try again later.
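
To make that concrete, here is a rough sketch of what I mean, in Python for
brevity. It assumes the agent channel is already open as a connected socket
and that replies are newline-terminated JSON; the guest_sync() helper and its
timeout/retries/retry_delay parameters are made up for illustration and are
not libvirt or qemu code. The only protocol bit it relies on is that
guest-sync echoes back the id it was given.

```python
import json
import random
import socket
import time


def guest_sync(sock, timeout=5.0, retries=3, retry_delay=1.0):
    """Return True once the agent echoes our guest-sync id, else False.

    Hypothetical helper: 'sock' is an already-connected agent channel.
    """
    for attempt in range(retries):
        # Fresh id per attempt so a stale reply cannot be mistaken for ours.
        sync_id = random.randint(1, 2**31 - 1)
        request = {"execute": "guest-sync", "arguments": {"id": sync_id}}
        try:
            sock.settimeout(timeout)
            sock.sendall(json.dumps(request).encode() + b"\n")
            deadline = time.monotonic() + timeout
            buf = b""
            while True:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    raise socket.timeout()
                sock.settimeout(remaining)
                chunk = sock.recv(4096)
                if not chunk:
                    break  # channel closed; fall through to the retry path
                buf += chunk
                while b"\n" in buf:
                    line, buf = buf.split(b"\n", 1)
                    try:
                        reply = json.loads(line)
                    except ValueError:
                        continue  # ignore garbage left over on the channel
                    if isinstance(reply, dict) and reply.get("return") == sync_id:
                        return True
        except OSError:
            pass  # timeout or I/O error: the agent is treated as not there yet
        if attempt + 1 < retries:
            time.sleep(retry_delay)  # "try again later"
    return False
```

The caller never blocks longer than roughly retries * (timeout + retry_delay),
and only sends real commands (set-support-level, guest-suspend, ...) after
guest_sync() has returned True.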



