[libvirt] [Qemu-devel] [PATCH RFC 0/4] Allow hibernation on guests

Michael Roth mdroth at linux.vnet.ibm.com
Mon Jan 30 15:43:42 UTC 2012


On 01/30/2012 08:44 AM, Luiz Capitulino wrote:
> On Mon, 30 Jan 2012 07:54:56 -0600
> Anthony Liguori<anthony at codemonkey.ws>  wrote:
>
>> On 01/30/2012 06:57 AM, Luiz Capitulino wrote:
>>> On Thu, 26 Jan 2012 16:57:01 -0600
>>> Anthony Liguori<anthony at codemonkey.ws>   wrote:
>>>
>>>> On 01/26/2012 01:35 PM, Luiz Capitulino wrote:
>>>>> On Thu, 26 Jan 2012 08:18:03 -0700
>>>>> Eric Blake<eblake at redhat.com>    wrote:
>>>>>
>>>>>> [adding qemu-devel]
>>>>>>
>>>>>> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
>>>>>>>> One thing, that you'll probably notice is this
>>>>>>>> 'set-support-level' command. Basically, it tells GA what qemu version
>>>>>>>> is it running on. Ideally, this should be done as soon as
>>>>>>>> GA starts up. However, that cannot be determined from outside
>>>>>>>> world as GA doesn't emit any events yet.
>>>>>>>> Ideally^2 this command should be left out as it should be qemu
>>>>>>>> who tells its own agent this kind of information.
>>>>>>>> Anyway, I was going to call this command in qemuProcess{Startup,
>>>>>>>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs
>>>>>>>> so guest can boot and start GA, but that implies returning from qemuProcess*.
>>>>>>>>
>>>>>>>> So I am setting this just before 'guest-suspend' command, as
>>>>>>>> there is one more thing about GA. It is unable to remember anything
>>>>>>>> upon its restart (GA process). Which has BTW show flaw
>>>>>>>> in our current code with FS freeze&    thaw. If we freeze guest
>>>>>>>> FS, and somebody restart GA, the simple FS Thaw will not succeed as
>>>>>>>> GA thinks FS are not frozen. But that's a different cup of tea.
>>>>>>>>
>>>>>>>> Because of what written above, we need to call set-level
>>>>>>>> on every suspend.
>>>>>>>
>>>>>>>
>>>>>>> IMHO all this says that the 'set-level' command is a conceptually
>>>>>>> unfixably broken design&    should be killed in QEMU before it turns
>>>>>>> into an even bigger mess.
>>>>>
>>>>> Can you elaborate on this? Michal and I talked on irc about making the
>>>>> compatibility level persistent, would that help?
>>>>>
>>>>>>> Once we're in a situation where we need to call 'set-level' prior
>>>>>>> to every single invocation, you might as well just allow the QEMU
>>>>>>> version number to be passed in directly as an arg to the command
>>>>>>> you are running directly thus avoiding this horrificness.
>>>>>>
>>>>>> Qemu folks, would you care to chime in on this?
>>>>>>
>>>>>> Exactly how is the set-level command supposed to work?  As I understand
>>>>>> it, the goal is that if the guest has qemu-ga 1.1 installed, but is
>>>>>> being run by qemu 1.0, then we want to ensure that any guest agent
>>>>>> command supported by qemu-ga 1.1 but requiring features of qemu not
>>>>>> present in qemu 1.0 will be properly rejected.
>>>>>
>>>>> Not exactly, the default support of qemu-ga is qemu 1.0. This means that by
>>>>> default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This
>>>>> way the set-support-level command allows you to specify that qemu 2.0 features
>>>>> are supported.
>>>>
>>>> Version numbers are meaningless.  What happens when a bunch of features get
>>>> backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of
>>>> 2.0?
>>>>
>>>> The feature negotiation mechanism we have in QMP is the existence of a command.
>>>>     If we're in a position where we're trying to disable part of a command, it
>>>> simply means that we should have multiple commands such that we can just remove
>>>> the disabled part entirely.
>>>
>>> You may have a point that we shouldn't be using the version number for that,
>>> but just switching to multiple commands doesn't solve the fundamental problem.
>>>
>>> The fundamental problem is that, S3 in current (and old) qemu has two known bugs:
>>>
>>>    1. The screen is left black after S3 (it's a bug in seabios)
>>>    2. QEMU resumes the guest immediately (Gerd posted patches to address this)
>>>
>>> We're going to address both issues in 1.1. However, if qemu-ga is installed in
>>> an old qemu and S3 is used, the bugs will be triggered.
>>
>> It's a management tool problem.
>>
>> Before a management tool issues a command, it should query the existence of the
>> command to determine whether this version of QEMU has that capability.  If the
>> tool needs to use two commands, it should query the existence of both of them.
>>
>> In this case, the management tool needs a qemu-ga command *and* a QEMU command
>> (to resume from suspend) so it should query both of them.
>>
>> Obviously, we wouldn't have a resume-from-suspend command in QEMU unless it S3
>> worked in QEMU as expected.
>
> That's right, it's a coincidence, but I don't see why this wouldn't work.
>
> I think we should do the following then:
>
>   1. Drop the set-support-level command
>   2. Split the guest-suspend command into guest-suspend-ram, guest-suspend-hybrid,
>      guest-suspend-disk
>   3. Libvirt should query for _QEMU_'s 'wakeup' command before issuing
>      the guest-suspend-ram command
>
> Michal, Michael, do you agree?

Yup yup, looks sound to me.

>
>> Alternatively, if there really was no reason to have a resume-from-suspend
>> command, this would be the point where we would add a capabilities command
>> adding the "working-s3" capability.
>>
>> But with capabilities, this is a direct QEMU->management tool interaction, not a
>> proxy through the guest agent.
>>
>> We shouldn't trust the guest agent and we certainly don't want to rely on the
>> guest agent to avoid sending an improper command to QEMU!  That would be a
>> security issue.
>
> Fair enough, although that makes me wonder if the planned feature of pulling
> qemu-ga's commands into the QMP namespace won't have similar security issues.
>
>>
>>>
>>> We need a way for qemu-ga to query qemu about the existence of a working S3
>>> support. The set-support-level solves that.
>>
>> qemu-ga is not an entry point for QEMU features.  It's strictly a mechanism to
>> ask the guest to do something.  If we need to interact with QEMU directly to
>> query a capability and/or presence of a command, then we should talk to QEMU
>> directly.
>>
>> To put it another way, a management tool MUST deal with the fact that when
>> issuing the suspend-to-ram command, a guest may ignore it or attempt to do
>> something malicious.
>>
>> Regards,
>>
>> Anthony Liguori
>>
>>
>>> Another option would be to disable (or enable) S3 by default in qemu-ga, and let
>>> the admin enable (or disable it) according to S3 support being fixed in qemu.
>>>
>>
>
>




More information about the libvir-list mailing list