[libvirt] Libvirt domain event usage and consistency

Michal Privoznik mprivozn at redhat.com
Fri Nov 25 19:23:46 UTC 2016


On 25.11.2016 18:33, Roman Mohr wrote:
> On Fri, Nov 25, 2016 at 6:04 PM, Michal Privoznik <mprivozn at redhat.com>
> wrote:
> 
>> On 25.11.2016 17:54, Roman Mohr wrote:
>>> On Fri, Nov 25, 2016 at 4:34 PM, Michal Privoznik <mprivozn at redhat.com>
>>> wrote:
>>>
>>>> On 25.11.2016 14:38, Roman Mohr wrote:
>>>>> Hi,
>>>>>
>>>>> I recently started to use the libvirt domain events. With them I
>> increase
>>>>> the responsiveness of my VM state wachers.
>>>>> In general it works pretty well. I just listen to the events and do a
>>>>> periodic resync to cope with missed events.
>>>>>
>>>>> While watching the events I ran into a few interesting situations I
>>>> wanted
>>>>> to share. The points 1-3 describe some minor issues or irregularities.
>>>>> Point 4 is about the fact that domain and state updates are not
>> versioned
>>>>> which makes it very hard to stay in sync with libvirt when using
>> events.
>>>>>
>>>>> My libvirt version is 1.2.18.4.
>>>>
>>>> This might be the root cause. I'm unable to see some of the scenarios
>>>> you're seeing. Have you tried the latest release (or even git HEAD) to
>>>> check whether all the scenarios you are describing still stand?
>>>>
>>>
>>> Definitely better with latest HEAD but still it does not look completely
>>> right.
>>>
>>>>
>>>>>
>>>>> 1) Event order seems to be weird on startup:
>>>>>
>>>>> When listening for VM lifecycle events I get this order:
>>>>>
>>>>> {"event_type": "Started", "timestamp": "2016-11-25T11:59:53.209326Z",
>>>>> "reason": "Booted", "domain_name": "generic", "domain_id":
>>>>> "8ff7047b-fb46-44ff-a4c6-7c20c73ab86e"}
>>>>> {"event_type": "Defined", "timestamp": "2016-11-25T11:59:53.435530Z",
>>>>> "reason": "Added", "domain_name": "generic", "domain_id":
>>>>> "8ff7047b-fb46-44ff-a4c6-7c20c73ab86e"}
>>>>>
>>>>> It is strange that a VM already boots before it is defined. Is this the
>>>>> intended order?
>>>>
>>>> I don't see this order so probable this is fixed upstream.
>>>>
>>>
>>> On latest master a normal creation emits these events:
>>>
>>> event 'lifecycle' for domain testvm: Resumed Unpaused
>>> event 'lifecycle' for domain testvm: Started Booted
>>>
>>> The Resumed event looks wrong. Further I get no more Defined/Undefined
>>> events. Maybe they were removed?
>>
>> Yes, they were removed.
> 
> 
> Nice
> 
> 
>> The Resumed event comes from qemu actually,
>> because libvirt starts qemu in paused mode so that we can do some setup
>> (e.g. place vcpu threads into cgroups) and only after that we can resume
>> guest CPUs and in fact let guest start. Once this is done we
>> deliberately emit Started event.
>>
> 
> I would expect an event like this:
> 
> event 'lifecycle' for domain testvm: Suspended Bootstrapping
> 
> before the other two events. That takes the ambiguity from the Resumed
> event.
> 
> 
>>>
>>>
>>>>
>>>>>
>>>>> 2) Defining a VM with VIR_DOMAIN_START_PAUSED gives me this event order
>>>>
>>>> I don't think you can define a domain with that flag. What's the actual
>>>> action?
>>>>
>>>
>>> That is the flag for the api, when using virsh using `--paused` does
>> that.
>>
>> Ah, that's for virsh create/start not virsh define. Anyway, this is no
>> longer the case with upstream, is it?
>>
>>
> Right
> 
> 
>>>
>>>
>>>>
>>>>>
>>>>> {"event_type": "Defined", "timestamp": "2016-11-25T12:02:44.037817Z",
>>>>> "reason": "Added", "domain_name": "core_node", "domain_id":
>>>>> "b9906489-6d5b-40f8-a742-ca71b2b84277"}
>>>>> {"event_type": "Resumed", "timestamp": "2016-11-25T12:02:44.813104Z",
>>>>> "reason": "Unpaused", "domain_name": "core_node", "domain_id":
>>>>> "b9906489-6d5b-40f8-a742-ca71b2b84277"}
>>>>> {"event_type": "Started", "timestamp": "2016-11-25T12:02:44.813733Z",
>>>>> "reason": "Booted", "domain_name": "core_node", "domain_id":
>>>>> "b9906489-6d5b-40f8-a742-ca71b2b84277"}
>>>>
>>>>
>>>> Interesting, so here is "defined" event delivered before the "started"
>>>> event. Also - where is "suspended" event?
>>>>
>>>
>>>
>>> With latest master the situation looks better. Now I see
>>>
>>> event 'lifecycle' for domain testvm: Started Booted
>>> event 'lifecycle' for domain testvm: Suspended Paused
>>
>> Again, both of these are deliberately emitted by libvirt and in fact I
>> think they reflect what is happening.
>>
>>
> Why is in this case  not
> 
>  event 'lifecycle' for domain testvm: Resumed Unpaused
> 
> event emitted?
> 
> I would expect
> 
> event 'lifecycle' for domain testvm: Resumed Unpaused

I fail to see why this is supposed to be emitted. The QEMU is started as
paused. If this even would be emitted, even if immediately followed by
Paused event, it might suggest that guest CPUs have run for a little
while. But that's not true.

> event 'lifecycle' for domain testvm: Started Booted
> event 'lifecycle' for domain testvm: Suspended Paused
> 
> 
> So the situation on master is much better but because of the
> Resumed/Unpaused event I still have the feeling that the most simple but
> powerful usecase, watching for CREATE, UPDATE, DELETE is very hard because
> you can't know if the Resumed/Unpaused is the indicator for CREATE or
> UPDATE.
> 

It's neither. Resumed event reflects vCPU state, not the domain XML
change. Maybe we need to document that better.

For domain create event you need to watch for Started lifecycle event.
For domain update event you need to watch for device-* event.
For domain shutdown event you need to watch for Stopped lifecycle event.

Michal




More information about the libvir-list mailing list