[libvirt] [PATCH 03/10] qemu: agent: clear error flag upon init

Nikolay Shirokovskiy nshirokovskiy at virtuozzo.com
Thu Oct 27 10:54:16 UTC 2016



On 26.10.2016 22:53, John Ferlan wrote:
> 
> 
> On 10/04/2016 02:56 AM, Nikolay Shirokovskiy wrote:
>> There were a few errors in the code when this flag was not
>> cleared upon monitor cleanup. All of them could be fixed
>> just resetting this flag upon agent monitor initialization.
> 
> We should fix the places where the flag wasn't cleared properly then.

Well, the first 2 patches are exactly for this purpose.

> 
>> ---
>>  src/qemu/qemu_process.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
>> index 3637f4b..42f7f84 100644
>> --- a/src/qemu/qemu_process.c
>> +++ b/src/qemu/qemu_process.c
>> @@ -210,6 +210,8 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm)
>>      qemuAgentPtr agent = NULL;
>>      virDomainChrDefPtr config = qemuFindAgentConfig(vm->def);
>>  
>> +    priv->agentError = false;
>> +
> 
> This idea actually may have some merit and may actually be the cause of
> the bug you saw, but not right here.
> 
>>      if (!config)
>>          return 0;
>>  
>>
> 
> I think perhaps a better place would be some time after we ascertain
> that "priv->agent" is not already set since the only way to set it is as
> a result of this function.
> 
> My first inclination was in that VSERPORT_CHANGE check and return 0
> (e.g. if called from qemuMigrationFinish, qemuProcessReconnect,
> qemuProcessLaunch, or qemuProcessAttach).
> 
> But then I wasn't sure if there would be a race between setting 'state'
> to VIR_DOMAIN_CHR_DEVICE_STATE_CONNECTED in processSerialChangedEvent
> and one of the 4 other paths.
> 
> So, I think perhaps the better place would be after the virObjectRef(vm)
> prior to calling qemuAgentOpen and unlocking the vm. IOW: We have the
> lock, alter priv->agentError.  We're about to start anyway...

If we fail to clean up the error flag correctly (patches 1 and 2 are dropped)
on shutdown then cleaning it up in this place will be too late. For example
in case we have QEMU_CAPS_VSERPORT_CHANGE then until serial port event
we will be in error state while it is unconnected really.

Resetting the error flag in this function is really a hack, a protection
from errors like in patch 1 or 2. After thinking a while I would 
event put resetting before 

    if (!config)                                                                
        return 0;   

check. Consider next situation:

1. shutown and the error flag is set after shutdown as described earlier
2. config is changed so that agent channel is removed
3. domain start
4. now as the flag is set, on attempt to use agent we get the same
"QEMU guest agent is not available due to an error" message while 
agent is just not configured.

> 
> That way, we'd know we're about the unconditionally clear the agentError
> just as if we were starting the first time, reconnecting to a running
> domain, migrating, or attaching.
> 
> Thus ignoring patch 1 and 2 for a moment and considering this patch
> alone, we'd know for sure the agentError was cleared before a "real"
> startup and after a qemuProcessStop.  Thus clearing before we start
> seems to be right (although it makes me wonder how it could have been set).
> 
> Personally, I'd start with applying patch 4 and then trying to reproduce
> the condition... Have some sort of "marker" here that checks if
> agentError is true then complain.  If we can figure out how we get into
> this function with agentError being set, then I think that'll really
> help me understand the order of events...
> 
> 
> John
> 




More information about the libvir-list mailing list