[libvirt] [PATCH v3 1/2] process: wait longer on kill per assigned Hostdev

Daniel P. Berrangé berrange at redhat.com
Tue Aug 21 12:32:00 UTC 2018


On Tue, Aug 21, 2018 at 02:20:21PM +0200, Christian Ehrhardt wrote:
> On Tue, Aug 21, 2018 at 1:15 PM Daniel P. Berrangé <berrange at redhat.com>
> wrote:
> 
> > On Tue, Aug 14, 2018 at 11:27:33AM +0200, Christian Ehrhardt wrote:
> > > It was found that in cases with host devices virProcessKillPainfully
> > > might be able to send signal zero to the target PID for quite a while
> > > with the process already being gone from /proc/<PID>.
> > >
> > > That is due to cleanup and reset of devices which might include a
> > > secondary bus reset that on top of the actions taken has a 1s delay
> > > to let the bus settle. Due to that guests with plenty of Host devices
> > > could easily exceed the default timeouts.
> > >
> > > To solve that, this adds an extra delay of 2s per hostdev that is
> > associated
> > > to a VM.
> > >
> > > Signed-off-by: Christian Ehrhardt <christian.ehrhardt at canonical.com>
> > > ---
> > >  src/libvirt_private.syms |  1 +
> > >  src/qemu/qemu_process.c  |  5 +++--
> > >  src/util/virprocess.c    | 18 +++++++++++++++---
> > >  src/util/virprocess.h    |  1 +
> > >  4 files changed, 20 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
> > > index ca4a192a4a..47ea35f864 100644
> > > --- a/src/libvirt_private.syms
> > > +++ b/src/libvirt_private.syms
> > > @@ -2605,6 +2605,7 @@ virProcessGetPids;
> > >  virProcessGetStartTime;
> > >  virProcessKill;
> > >  virProcessKillPainfully;
> > > +virProcessKillPainfullyDelay;
> > >  virProcessNamespaceAvailable;
> > >  virProcessRunInMountNamespace;
> > >  virProcessSchedPolicyTypeFromString;
> > > diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> > > index 02fdc55156..b7bf8813da 100644
> > > --- a/src/qemu/qemu_process.c
> > > +++ b/src/qemu/qemu_process.c
> > > @@ -6814,8 +6814,9 @@ qemuProcessKill(virDomainObjPtr vm, unsigned int
> > flags)
> > >          return 0;
> > >      }
> > >
> > > -    ret = virProcessKillPainfully(vm->pid,
> > > -                                  !!(flags &
> > VIR_QEMU_PROCESS_KILL_FORCE));
> > > +    ret = virProcessKillPainfullyDelay(vm->pid,
> > > +                                       !!(flags &
> > VIR_QEMU_PROCESS_KILL_FORCE),
> > > +                                       vm->def->nhostdevs * 2);
> >
> > This API contract is a bit wierd. You've got an arbitray x2 multiplier
> > here...
> >
> 
> Out of the discussion with Alex I derived that due to the 1 sec settling of
> the bus plus some extra work 2 seconds per device would be a good rule of
> thumb.
> The delay is meant to be in seconds and here it requests 2*nhostdevs to get
> the amount needed.
> 
> And I'd not want to make the "unit" of the call "double-seconds" :-)
> 
> Let me add a comment at this call to explain the reasoning on the
> multiplier here.


> > > -virProcessKillPainfully(pid_t pid, bool force)
> > > +virProcessKillPainfullyDelay(pid_t pid, bool force, unsigned int
> > extradelay)
> > >  {
> > >      size_t i;
> > >      int ret = -1;
> > > +    unsigned int delay = 75 + (extradelay*5);
> >
> > ...and it gets another x5 multiplier here.
> >
> 
> That *5 is due to the internal unit it polls being 0.2 seconds.
> I think externally anything other than seconds (double-seconds,
> half-deca-seconds, ...) makes no sense.

Opps, yes, I mis-understood the behaviour of the code. A comment is fine


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




More information about the libvir-list mailing list