[libvirt] [PATCH 3/3] qemu: agent: take monitor lock in qemuAgentNotifyEvent

Nikolay Shirokovskiy nshirokovskiy at virtuozzo.com
Mon Dec 12 07:42:24 UTC 2016



On 09.12.2016 17:42, John Ferlan wrote:
> 
> 
> On 12/09/2016 03:55 AM, Nikolay Shirokovskiy wrote:
>>
>>
>> On 08.12.2016 19:40, John Ferlan wrote:
>>>
>>>
>>> On 11/24/2016 04:19 AM, Nikolay Shirokovskiy wrote:
>>>> qemuAgentNotifyEvent notify on a lock condition without taking
>>>> the lock. This works but it is a subject to race conditions.
>>>> ---
>>>>  src/qemu/qemu_agent.c | 4 ++++
>>>>  1 file changed, 4 insertions(+)
>>>>
>>>
>>> But the vm is locked prior to any priv->agent dereference and call - so
>>> what path could free priv->agent before we get into this NotifyEvent
>>
>> This patche does not fix NULL or dangling pointer reference...
>>
>>> code?  I suppose it wouldn't hurt, but we're not entering the agent and
>>> the AgentEOF would require vm lock to clear agent.
>>
>> It is just we can not deliver signal reliably if lock is not taken. It is
>> race condition. Acessing monitor fields without lock in general is risky too.
>>
>> For example imagine we are in the process of sending agent command, vm lock
>> is dropped, sending thread holds agent lock and at the same time reset comes -
>> it is possible as nobody holds vm lock, so we come here and now we have
>> dangerous simultaneous access from both threads - sending and that notifies us
>> of a reset.
> 
> 
> OK - and this is the type of information/details that should go into the
> commit message.  When you chase something down, leave a few bread crumbs
> of details.
> 
> Anyway the details are, await_event is only set for qemuAgentShutdown
> and qemuAgentSuspend. Both have the agent lock to do so and their
> callers are the only ones that care. So the issue is that one of those
> two is called and at some time after releasing the domain lock and
> before setting await_event to RESET, SHUTDOWN, or SUSPEND that
> qemuAgentNotifyEvent is called from qemuProcessHandleReset,
> qemuProcessHandleShutdown, qemuProcessHandlePMSuspend, or
> qemuProcessHandlePMSuspendDisk which only take the vmobj lock before
> looking at agent fields.
> 
> With that kind of detail provided in the commit message wouldn't leave
> me wondering what was being chased. As you saw my first assumption was
> the race you were referring to is the agent free race. With the extra
> details I think I'm able to piece together what race you are referring
> to. So ACK to the patch, but when you create your v2 for this series,
> please update the commit message with more details. Feel free to
> liberally steal the above paragraph.
> 

Ok.

Honestly I did not think of a detailed race screnario when I send a patch.
It is just hit me that notify is called without a lock, I thought before it 
is incorrect code. As it turns out it is not, one can call notify without a
lock. However there is a reason why it is worth taking lock anyway - 
http://stackoverflow.com/questions/4544234/calling-pthread-cond-signal-without-locking-mutex

Nikolay

>>
>>
>>>
>>>> diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c
>>>> index 5230cbc..ad031d0 100644
>>>> --- a/src/qemu/qemu_agent.c
>>>> +++ b/src/qemu/qemu_agent.c
>>>> @@ -1248,6 +1248,8 @@ qemuAgentMakeStringsArray(const char **strings, unsigned int len)
>>>>  void qemuAgentNotifyEvent(qemuAgentPtr mon,
>>>>                            qemuAgentEvent event)
>>>>  {
>>>> +    virObjectLock(mon);
>>>> +
>>>>      VIR_DEBUG("mon=%p event=%d await_event=%d", mon, event, mon->await_event);
>>>>      if (mon->await_event == event) {
>>>>          mon->await_event = QEMU_AGENT_EVENT_NONE;
>>>> @@ -1257,6 +1259,8 @@ void qemuAgentNotifyEvent(qemuAgentPtr mon,
>>>>              virCondSignal(&mon->notify);
>>>>          }
>>>>      }
>>>> +
>>>> +    virObjectUnlock(mon);
>>>>  }
>>>>  
>>>>  VIR_ENUM_DECL(qemuAgentShutdownMode);
>>>>




More information about the libvir-list mailing list