[libvirt] [PATCH 3/3] qemu: agent: take monitor lock in qemuAgentNotifyEvent

John Ferlan jferlan at redhat.com
Fri Dec 9 14:42:38 UTC 2016

On 12/09/2016 03:55 AM, Nikolay Shirokovskiy wrote:
> On 08.12.2016 19:40, John Ferlan wrote:
>> On 11/24/2016 04:19 AM, Nikolay Shirokovskiy wrote:
>>> qemuAgentNotifyEvent notify on a lock condition without taking
>>> the lock. This works but it is a subject to race conditions.
>>> ---
>>>  src/qemu/qemu_agent.c | 4 ++++
>>>  1 file changed, 4 insertions(+)
>> But the vm is locked prior to any priv->agent dereference and call - so
>> what path could free priv->agent before we get into this NotifyEvent
> This patche does not fix NULL or dangling pointer reference...
>> code?  I suppose it wouldn't hurt, but we're not entering the agent and
>> the AgentEOF would require vm lock to clear agent.
> It is just we can not deliver signal reliably if lock is not taken. It is
> race condition. Acessing monitor fields without lock in general is risky too.
> For example imagine we are in the process of sending agent command, vm lock
> is dropped, sending thread holds agent lock and at the same time reset comes -
> it is possible as nobody holds vm lock, so we come here and now we have
> dangerous simultaneous access from both threads - sending and that notifies us
> of a reset.

OK - and this is the type of information/details that should go into the
commit message.  When you chase something down, leave a few bread crumbs
of details.

Anyway the details are, await_event is only set for qemuAgentShutdown
and qemuAgentSuspend. Both have the agent lock to do so and their
callers are the only ones that care. So the issue is that one of those
two is called and at some time after releasing the domain lock and
before setting await_event to RESET, SHUTDOWN, or SUSPEND that
qemuAgentNotifyEvent is called from qemuProcessHandleReset,
qemuProcessHandleShutdown, qemuProcessHandlePMSuspend, or
qemuProcessHandlePMSuspendDisk which only take the vmobj lock before
looking at agent fields.

With that kind of detail provided in the commit message wouldn't leave
me wondering what was being chased. As you saw my first assumption was
the race you were referring to is the agent free race. With the extra
details I think I'm able to piece together what race you are referring
to. So ACK to the patch, but when you create your v2 for this series,
please update the commit message with more details. Feel free to
liberally steal the above paragraph.

> Nikolay
>>> diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c
>>> index 5230cbc..ad031d0 100644
>>> --- a/src/qemu/qemu_agent.c
>>> +++ b/src/qemu/qemu_agent.c
>>> @@ -1248,6 +1248,8 @@ qemuAgentMakeStringsArray(const char **strings, unsigned int len)
>>>  void qemuAgentNotifyEvent(qemuAgentPtr mon,
>>>                            qemuAgentEvent event)
>>>  {
>>> +    virObjectLock(mon);
>>> +
>>>      VIR_DEBUG("mon=%p event=%d await_event=%d", mon, event, mon->await_event);
>>>      if (mon->await_event == event) {
>>>          mon->await_event = QEMU_AGENT_EVENT_NONE;
>>> @@ -1257,6 +1259,8 @@ void qemuAgentNotifyEvent(qemuAgentPtr mon,
>>>              virCondSignal(&mon->notify);
>>>          }
>>>      }
>>> +
>>> +    virObjectUnlock(mon);
>>>  }
>>>  VIR_ENUM_DECL(qemuAgentShutdownMode);

More information about the libvir-list mailing list