[libvirt] [PATCH] qemu_agent: Issue guest-sync prior to every command

Eric Blake eblake at redhat.com
Tue Feb 7 16:11:57 UTC 2012


On 02/07/2012 08:49 AM, Michal Privoznik wrote:

>> We could still timeout the 'fs-freeze' command after 30 seconds
>> or so. Given that we issue the guest-resync command, we'll be
>> able to automatically re-sync the JSON protocol by dropping the
>> later arriving fs-freeze reply (if any).
> 
> I don't think this is a good idea. I've chosen 'fs-freeze' intentionally
> :) It's something that actually might take ages - to sync disks (which
> is what current implementation does). Therefore if we set any timeout
> for regular commands we may get into inconsistent state:
> 
> 1) issue fs-freeze
> 2) timeout and return error (everybody thinks fs is not frozen)
> 3) receive "okay, frozen" from GA

Question for the qemu-folks:

We've already documented that qemu-ga must be treated as an asynchronous
interface; callers cannot expect the agent to reliably reply, and must
always have a timeout mechanism in place.  Doesn't that mean that any
guest agent command that might potentially be long-running should
instead be broken up into multiple commands, one to start the process,
and another to query whether the process has completed?

That is, since fs-freeze might be potentially long-running, should we
break it into multiple commands:

  fs-freeze-async: requests that a freeze be started, and returns an
    immediate ack if the process is started
  fs-freeze-query: returns the status of whether the system is thawed,
    frozen, or in the process of transitioning

libvirt would then proceed in three steps.  First, issue a guest-sync
with a reasonable timeout to ensure the agent is currently responsive;
if that fails, the agent is not available.  Second, issue an
fs-freeze-async with a reasonable timeout; if that fails, the freeze is
not possible.  Third, issue periodic fs-freeze-query commands until the
freeze completes; if any of them fails, assume the agent restarted but
that the system is frozen, and therefore libvirt should send an fs-thaw
command prior to returning failure, just in case.
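
To make the sequence concrete, here is a rough Python sketch of the
flow I have in mind.  It is only an illustration under assumptions:
fs-freeze-async and fs-freeze-query are the proposed (not existing)
commands, and send(), AgentTimeout, the result strings, the "frozen"
status value, and the max_polls cap are all hypothetical names I made
up for the example.

```python
import time


class AgentTimeout(Exception):
    """Hypothetical: raised when the agent gives no reply in time."""


def freeze_with_polling(send, timeout=5.0, poll_interval=1.0,
                        max_polls=60, sleep=time.sleep):
    """Drive the proposed start/query freeze sequence.

    send(command, timeout) is a hypothetical transport callback that
    returns the agent's reply or raises AgentTimeout.
    """
    # Step 1: make sure the agent is responsive at all.
    try:
        send("guest-sync", timeout)
    except AgentTimeout:
        return "agent-unavailable"

    # Step 2: ask for the freeze to start; only an ack is expected.
    try:
        send("fs-freeze-async", timeout)
    except AgentTimeout:
        return "freeze-not-possible"

    # Step 3: poll until the agent reports the freeze as complete.
    try:
        for _ in range(max_polls):
            if send("fs-freeze-query", timeout) == "frozen":
                return "frozen"
            sleep(poll_interval)
    except AgentTimeout:
        pass  # fall through to the defensive thaw below

    # A query failed, or (an assumption of this sketch) the freeze did
    # not finish within max_polls: the guest may be frozen, so thaw
    # before reporting failure, just in case.
    try:
        send("fs-thaw", timeout)
    except AgentTimeout:
        pass
    return "failed-thaw-sent"
```

The point of the structure is that every individual command has a
short timeout, while the overall freeze may still take as long as it
needs.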

>>
>> According to the 'guest-sync' QMP spec, we need to send the magic byte
>> '0xFF' immediately before the guest-sync command data is sent.
> 
> Yeah, and probably switch to new guest-sync-delimited command as soon as
> it's upstream.

If I'm understanding the recent proposals correctly, guest-sync exists
in 1.0 guest agents, but guest-sync-delimited does not.  We can always
send 0xff, but we can only expect to receive 0xff if we use
guest-sync-delimited, which means we need to probe whether the guest
agent understands guest-sync-delimited.  Is it safe to send a 1.0 guest
agent a command it doesn't understand, like guest-sync-delimited, and
expect to get a reliable error message in reply?
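
For illustration, the probe could look roughly like the Python sketch
below.  It assumes the answer to my question is "yes", i.e. that an
older agent replies with a JSON object carrying an "error" member;
that is exactly the part I'm unsure about.  build_sync_request(),
classify_probe_reply(), and the result strings are hypothetical names
for this example only.

```python
import json

SENTINEL = b"\xff"  # byte sent ahead of the sync command


def build_sync_request(command, sync_id):
    """Return the bytes to write: 0xff, then the JSON command."""
    payload = {"execute": command, "arguments": {"id": sync_id}}
    return SENTINEL + json.dumps(payload).encode("ascii") + b"\n"


def classify_probe_reply(raw, sync_id):
    """Classify the agent's reply to a guest-sync-delimited probe.

    Returns "delimited" if the reply was preceded by 0xff and echoes
    sync_id, "unsupported" if the agent sent an error object, and
    "unknown" for anything else (stale or truncated data).
    """
    had_sentinel = SENTINEL in raw
    # Drop any stale bytes that arrived before the last 0xff.
    body = raw.rsplit(SENTINEL, 1)[-1]
    try:
        reply = json.loads(body.decode("ascii"))
    except ValueError:  # covers both decode and JSON parse errors
        return "unknown"
    if not isinstance(reply, dict):
        return "unknown"
    if "error" in reply:
        # Assumption: an older agent rejects the unknown command
        # with an error reply rather than staying silent.
        return "unsupported"
    if had_sentinel and reply.get("return") == sync_id:
        return "delimited"
    return "unknown"
```

On "unsupported" the caller would fall back to plain guest-sync; on
"unknown" it would have to retry or time out as usual.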

-- 
Eric Blake   eblake at redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
