[libvirt] [PATCH v1 1/3] adding unplug_timeout QEMU conf

Daniel Henrique Barboza danielhb413 at gmail.com
Thu Aug 29 20:43:24 UTC 2019


Bump

Does anyone else want to weight in on this? Jano suggested that
exposing the timeout configuration is not the best plan, and I don't
have a strong opinion about it. I even mentioned below that I'm ok
with the idea of simply increasing the timeout to 15 seconds and
changing the error message a bit to let users know that this is
not an error per-se and the user should see the guest to see
the result.

If no one else have feelings for this matter, I'll simply re-sent the
series with the change I just mentioned. I might also keep the
unplug settings in the config object to get rid of qemu_hotplugpriv.h
(patch 3/3).


Thanks,


DHB


On 8/19/19 7:07 AM, Daniel Henrique Barboza wrote:
> Hi,
>
> On 8/19/19 6:16 AM, Ján Tomko wrote:
>> On Sun, Aug 18, 2019 at 06:45:29PM -0300, Daniel Henrique Barboza wrote:
>>> For some architectures and setups, device removal can take
>>> longer than the default 5 seconds. This results in commands
>>> such as 'virsh setvcpus' to fire timeout messages even if
>>> the actual operation happened in the guest, causing confusion
>>> for the user.
>>>
>>
>> The commit that introduced this error message:
>>    commit e3229f6e4461cd1721dc68a32e16ab1718ae716e
>>        qemu: hotplug: Add support for VCPU unplug
>>
>> specifically says that we treat this differently than regular device
>> detach:
>>
>>    As the new code is using device_del all the implications of using it
>>    are present. Contrary to the device deletion code, the vcpu deletion
>>    code fails if the unplug request is not executed in time.
>>
>> Technically, we already did execute the unplug request so we lie to the
>> user saying "operation failed".
>>
>> Maybe we can revisit the decision? [cc-ing pkrempa who added this]
>>
>
>
> I have thought about making setvcpus asynchronous when it is a
> device_del operation. This would be more code (perhaps a new command
> to do that? Or a --asynchronous option?), and we can set user 
> expectations
> properly by making it clear that this is a unplug request command, and 
> the
> user will need to check the result personally in the guest.
>
> But then, if we are prepared to tell the user "go check yourself" we 
> can simply
> change the current timeout message to say "vcpu unplug [...] timeout,
> check unplug status in the guest". This would be clearer, and the user
> wouldn't  automatically assume that timeout == operation failed.
>
> Another thing we can do, instead of exposing the option to the user 
> (which
> has a good potential for disaster - hence why I added a minimal 
> value), is
> to simply set the timeout to a greater value (10? 15?) and be done 
> with it.
> If we set the timeout to 15 seconds and change the timeout message to
> let the user know "we don't know, you'll need to check", like I said 
> above,
> we have more resilience and will not alarm the user if a timeout still 
> occurs.
> We'll also avoid exposing the timeout configuration like I'm doing here.
>
>
>>> This patch adds a new qemu.conf parameter called 'unplug_timeout'
>>> to handle these cases. If left unset, the current default
>>> timeout is used. To avoid user 'experimentation' with small
>>> timeouts, the current timeout is also the minimal value
>>> allowed.
>>>
>>
>> The reason for this timeout is that we originally promised something
>> that we cannot deliver - a synchronous device detach API, while the
>> operation itself is asynchronous. I'm not a fan of exposing it and
>> making it configurable.
>>
>>> Signed-off-by: Daniel Henrique Barboza <danielhb413 at gmail.com>
>>> ---
>>> src/qemu/libvirtd_qemu.aug         |  3 +++
>>> src/qemu/qemu.conf                 |  4 ++++
>>> src/qemu/qemu_conf.c               | 26 ++++++++++++++++++++++++++
>>> src/qemu/qemu_conf.h               |  5 +++++
>>> src/qemu/qemu_driver.c             |  2 ++
>>> src/qemu/test_libvirtd_qemu.aug.in |  1 +
>>> 6 files changed, 41 insertions(+)
>>>
>>
>> [...]
>>
>>> diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h
>>> index 0cbddd7a9c..29824e4e35 100644
>>> --- a/src/qemu/qemu_conf.h
>>> +++ b/src/qemu/qemu_conf.h
>>> @@ -214,6 +214,8 @@ struct _virQEMUDriverConfig {
>>>     gid_t swtpm_group;
>>>
>>>     char **capabilityfilters;
>>> +
>>> +    unsigned int unplugTimeout;
>>> };
>>>
>>> /* Main driver state */
>>> @@ -294,6 +296,9 @@ struct _virQEMUDriver {
>>>
>>>     /* Immutable pointer, self-locking APIs */
>>>     virHashAtomicPtr migrationErrors;
>>> +
>>> +    /* Immutable value */
>>> +    unsigned int unplugTimeout;
>>> };
>>>
>>
>> Why store this value twice?
>
> I wanted the value to be available at the driver object, but saw that the
> parsing of the reading file put stuff in config.
>
> However, just realized that we get to the cfg via qemu_driver->config
> (as long as the lock is being held, which I think it's the case for 
> all unplug
> operations).
>
> In case we still want this configurable timeout solution, I'll fix 
> this in the
> next spin.
>
>
> Thanks,
>
>
> DHB
>
>
>>
>> Jano
>




More information about the libvir-list mailing list