[libvirt] [PATCH v1 1/3] adding unplug_timeout QEMU conf
Daniel Henrique Barboza
danielhb413 at gmail.com
Thu Aug 29 20:43:24 UTC 2019
Bump
Does anyone else want to weight in on this? Jano suggested that
exposing the timeout configuration is not the best plan, and I don't
have a strong opinion about it. I even mentioned below that I'm ok
with the idea of simply increasing the timeout to 15 seconds and
changing the error message a bit to let users know that this is
not an error per-se and the user should see the guest to see
the result.
If no one else have feelings for this matter, I'll simply re-sent the
series with the change I just mentioned. I might also keep the
unplug settings in the config object to get rid of qemu_hotplugpriv.h
(patch 3/3).
Thanks,
DHB
On 8/19/19 7:07 AM, Daniel Henrique Barboza wrote:
> Hi,
>
> On 8/19/19 6:16 AM, Ján Tomko wrote:
>> On Sun, Aug 18, 2019 at 06:45:29PM -0300, Daniel Henrique Barboza wrote:
>>> For some architectures and setups, device removal can take
>>> longer than the default 5 seconds. This results in commands
>>> such as 'virsh setvcpus' to fire timeout messages even if
>>> the actual operation happened in the guest, causing confusion
>>> for the user.
>>>
>>
>> The commit that introduced this error message:
>> commit e3229f6e4461cd1721dc68a32e16ab1718ae716e
>> qemu: hotplug: Add support for VCPU unplug
>>
>> specifically says that we treat this differently than regular device
>> detach:
>>
>> As the new code is using device_del all the implications of using it
>> are present. Contrary to the device deletion code, the vcpu deletion
>> code fails if the unplug request is not executed in time.
>>
>> Technically, we already did execute the unplug request so we lie to the
>> user saying "operation failed".
>>
>> Maybe we can revisit the decision? [cc-ing pkrempa who added this]
>>
>
>
> I have thought about making setvcpus asynchronous when it is a
> device_del operation. This would be more code (perhaps a new command
> to do that? Or a --asynchronous option?), and we can set user
> expectations
> properly by making it clear that this is a unplug request command, and
> the
> user will need to check the result personally in the guest.
>
> But then, if we are prepared to tell the user "go check yourself" we
> can simply
> change the current timeout message to say "vcpu unplug [...] timeout,
> check unplug status in the guest". This would be clearer, and the user
> wouldn't automatically assume that timeout == operation failed.
>
> Another thing we can do, instead of exposing the option to the user
> (which
> has a good potential for disaster - hence why I added a minimal
> value), is
> to simply set the timeout to a greater value (10? 15?) and be done
> with it.
> If we set the timeout to 15 seconds and change the timeout message to
> let the user know "we don't know, you'll need to check", like I said
> above,
> we have more resilience and will not alarm the user if a timeout still
> occurs.
> We'll also avoid exposing the timeout configuration like I'm doing here.
>
>
>>> This patch adds a new qemu.conf parameter called 'unplug_timeout'
>>> to handle these cases. If left unset, the current default
>>> timeout is used. To avoid user 'experimentation' with small
>>> timeouts, the current timeout is also the minimal value
>>> allowed.
>>>
>>
>> The reason for this timeout is that we originally promised something
>> that we cannot deliver - a synchronous device detach API, while the
>> operation itself is asynchronous. I'm not a fan of exposing it and
>> making it configurable.
>>
>>> Signed-off-by: Daniel Henrique Barboza <danielhb413 at gmail.com>
>>> ---
>>> src/qemu/libvirtd_qemu.aug | 3 +++
>>> src/qemu/qemu.conf | 4 ++++
>>> src/qemu/qemu_conf.c | 26 ++++++++++++++++++++++++++
>>> src/qemu/qemu_conf.h | 5 +++++
>>> src/qemu/qemu_driver.c | 2 ++
>>> src/qemu/test_libvirtd_qemu.aug.in | 1 +
>>> 6 files changed, 41 insertions(+)
>>>
>>
>> [...]
>>
>>> diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h
>>> index 0cbddd7a9c..29824e4e35 100644
>>> --- a/src/qemu/qemu_conf.h
>>> +++ b/src/qemu/qemu_conf.h
>>> @@ -214,6 +214,8 @@ struct _virQEMUDriverConfig {
>>> gid_t swtpm_group;
>>>
>>> char **capabilityfilters;
>>> +
>>> + unsigned int unplugTimeout;
>>> };
>>>
>>> /* Main driver state */
>>> @@ -294,6 +296,9 @@ struct _virQEMUDriver {
>>>
>>> /* Immutable pointer, self-locking APIs */
>>> virHashAtomicPtr migrationErrors;
>>> +
>>> + /* Immutable value */
>>> + unsigned int unplugTimeout;
>>> };
>>>
>>
>> Why store this value twice?
>
> I wanted the value to be available at the driver object, but saw that the
> parsing of the reading file put stuff in config.
>
> However, just realized that we get to the cfg via qemu_driver->config
> (as long as the lock is being held, which I think it's the case for
> all unplug
> operations).
>
> In case we still want this configurable timeout solution, I'll fix
> this in the
> next spin.
>
>
> Thanks,
>
>
> DHB
>
>
>>
>> Jano
>
More information about the libvir-list
mailing list