[libvirt] latency between LIFECYCLE event and notification generation

Tue May 14 22:48:47 UTC 2013

On 05/14/2013 12:26 AM, nishant burte wrote:
> Thanks Eric and Daniel for the response.
> For the 2nd question, let me elaborate more.
> 
>>
>> 2. Second question is, can someone please explain what are the sequence of
>> steps happen between a VM going down and the notification is generated?
> 
> Let's say the VM crashed. The question is, how does qemu (or the
> hypervisor) come to know about it, and how does it generate the
> notification?

What do you mean, the VM crashed?  It's an honest question - there are
two levels of crashing: the qemu process that is running the VM crashed
[host bug], or the guest itself went into a panic in some way observable
by qemu [guest bug].

Right now, qemu can only report the first level of crashing (a qemu
failure), and we HOPE those are rare.  You can also wire up a watchdog
device into your guest, where if the guest doesn't feed the watchdog
often enough, then qemu can detect that, again as a first level
approximation.
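
To make the watchdog idea concrete, here is a toy model in Python.  It
is only a sketch of the concept, not how qemu implements its watchdog
devices; the Watchdog class, its method names, and the 30-second
timeout are all made up for illustration.

```python
import time


class Watchdog:
    """Toy watchdog timer (hypothetical names, not a qemu or libvirt API)."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout      # seconds the guest may go without feeding
        self.clock = clock          # injectable so the model can be tested
        self.last_fed = clock()

    def feed(self):
        # Guest side: called periodically while the guest is healthy.
        self.last_fed = self.clock()

    def expired(self):
        # Host side: True once the guest has missed its deadline.
        return self.clock() - self.last_fed > self.timeout
```

The point is that the host only learns "the guest stopped feeding the
timer", which is an approximation of "the guest is down".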

Patches adding a pvpanic device have been accepted for qemu 1.5; they
also depend on a new enough Linux kernel in the guest.  With that device
in place, a guest that detects its own panic can write a last-ditch
message to the dedicated device, which gives second-level panic
reporting.  Libvirt still needs to be wired up to expose this
second-level reporting in guest XML.  Also, while the device could
theoretically be used by any guest OS, right now I only know of new
enough Linux kernels that know how to use it (that is, I don't know if
anyone has written a Windows driver to be installed in a guest to tell
qemu when Windows goes into BSOD).

> 
> e.g. I am looking for explanation something on following lines.
> The VM pings the hypervisor periodically while it is up. When these
> heartbeats stop, the hypervisor detects that the VM has gone down and
> then sends the notification to libvirt.
> 
> Could you please give sequence of events on similar lines as given above?

Other than a watchdog device or a dedicated pvpanic device, I don't know
of any heartbeat at the qemu level.  A guest should run the same as it
does on bare metal, so how do you detect when a bare metal machine has
gone down?  If you can answer that (for example, if you have a
heartbeat at the IP level for deciding when to fence a bare metal
machine), then you can also set up that same heartbeat to decide whether
a guest is still up - but it would be at a higher level than what
qemu/libvirt provide you.  At the libvirt layer, we generally try to
avoid relying on the guest any more than we have to (it's more secure if
you assume the guest is malicious and therefore avoid making your
behavior depend on actions by the guest).
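
As a rough sketch of what such a higher-level heartbeat could look like
(this lives entirely outside qemu and libvirt), the Python below probes
a TCP port and only declares the target down after several consecutive
misses.  The function and class names, the use of a TCP connect as the
probe, and the threshold of 3 misses are assumptions chosen for
illustration.

```python
import socket


def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class HeartbeatMonitor:
    """Counts consecutive failed probes (hypothetical helper)."""

    def __init__(self, probe, max_misses=3):
        self.probe = probe            # callable returning True or False
        self.max_misses = max_misses  # misses tolerated before "down"
        self.misses = 0

    def check(self):
        """Run one probe; return True while the target still counts as up."""
        if self.probe():
            self.misses = 0
        else:
            self.misses += 1
        return self.misses < self.max_misses
```

You would call something like
HeartbeatMonitor(lambda: tcp_probe("192.0.2.10", 22)).check() on a
timer, where the address and port are placeholders.  The same code works
unchanged for a guest or a bare metal machine, and what to do once
check() returns False (fencing, alerting) is a policy decision left to
the caller.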

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
