after a good discussion a few days ago in
and a short-lived, at the time untested, v2 in
I finally got access to the right HW again and completed the series.
Now that it is retested and working, I feel safe submitting it without
an RFC prefix. I think this would be a great addition for better handling
of guests with many host devices passed through.
With the new code in place I can shut down systems that have 12, 16 or
even more hostdevs attached without getting into the "zombie" mode where
libvirt forever considers the guest as "in shutdown" because it gave up
waiting too early while signal zero was still able to reach the process.
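
For readers not familiar with the signal zero check mentioned above, here is a minimal sketch of such a probe. The helper name is hypothetical and this is not the libvirt code itself; it only relies on the POSIX rule that kill() with signal 0 performs the existence and permission checks without delivering a signal.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical helper, not a libvirt function: probe a pid with signal 0.
 * Nothing is delivered to the target; only the error checks are done. */
static bool process_still_reachable(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return true;    /* process exists and we may signal it */

    /* EPERM: the process exists but we lack permission to signal it.
     * ESRCH (or anything else): treat the process as gone. */
    return errno == EPERM;
}
```

As long as a probe like this keeps succeeding after the wait time has run out, the caller has to assume the process is still there, which is how the guest ends up stuck "in shutdown".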
Scaling examples (extracted with gdb):
16 Devices: virProcessKillPainfullyDelay (pid=67096, force=true, extradelay=32)
12 Devices: virProcessKillPainfullyDelay (pid=68251, force=true, extradelay=24)
*Updates in v4*
- virDebug now reports the extradelay as requested (in seconds) and
  thereby mostly matches the gdb output seen above
- header function prototype defines the variable name
- clarify the usage of delay units
  - seconds (API call)
  - fifths of a second (internal poll loop)
- explain the request for 2*nhostdevs from the qemu shutdown code
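
To make the delay units concrete, here is a small sketch of the arithmetic. The function names are made up for illustration and do not appear in the patches; the factor of two per hostdev matches the gdb output above (16 devices -> 32, 12 devices -> 24), and the factor of five follows from the internal loop counting in fifths of a second.

```c
#include <stddef.h>

/* Hypothetical helper: extra wait requested by the qemu shutdown code,
 * in seconds, assuming two extra seconds per assigned host device. */
static unsigned int extra_delay_seconds(size_t nhostdevs)
{
    return 2 * (unsigned int)nhostdevs;
}

/* Hypothetical helper: convert the API-level delay (seconds) into
 * iterations of a poll loop that ticks every fifth of a second. */
static unsigned int extra_poll_iterations(unsigned int extradelay)
{
    return extradelay * 5;
}
```

With these assumptions, 16 hostdevs give an extradelay of 32 seconds, i.e. 160 additional poll iterations on top of the base wait.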
*Updates in v3*
- fixup some issues found in testing and code checks
*Updates in v2*
- removed the "accept the lack of /proc/<pid> as valid process removal"
  approach due to valid concerns about reusing resources.
- added a dynamic extra wait scaling with the amount of hostdevs
Christian Ehrhardt (2):
process: wait longer on kill per assigned Hostdev
process: wait longer 5->30s on hard shutdown
FYI: since there was no further feedback, I pushed v4 with the appropriate Reviewed-by tags.
Thanks everybody for your participation!
src/libvirt_private.syms | 1 +
src/qemu/qemu_process.c | 7 +++++--
src/util/virprocess.c | 22 ++++++++++++++++++----
src/util/virprocess.h | 3 +++
4 files changed, 27 insertions(+), 6 deletions(-)