[libvirt] [PATCH v3 1/2] process: wait longer on kill per assigned Hostdev

Christian Ehrhardt christian.ehrhardt at canonical.com
Tue Aug 21 12:20:21 UTC 2018


On Tue, Aug 21, 2018 at 1:15 PM Daniel P. Berrangé <berrange at redhat.com>
wrote:

> On Tue, Aug 14, 2018 at 11:27:33AM +0200, Christian Ehrhardt wrote:
> > It was found that in cases with host devices virProcessKillPainfully
> > might be able to send signal zero to the target PID for quite a while
> > with the process already being gone from /proc/<PID>.
> >
> > That is due to cleanup and reset of devices which might include a
> > secondary bus reset that on top of the actions taken has a 1s delay
> > to let the bus settle. Due to that guests with plenty of Host devices
> > could easily exceed the default timeouts.
> >
> > To solve that, this adds an extra delay of 2s per hostdev that is
> associated
> > to a VM.
> >
> > Signed-off-by: Christian Ehrhardt <christian.ehrhardt at canonical.com>
> > ---
> >  src/libvirt_private.syms |  1 +
> >  src/qemu/qemu_process.c  |  5 +++--
> >  src/util/virprocess.c    | 18 +++++++++++++++---
> >  src/util/virprocess.h    |  1 +
> >  4 files changed, 20 insertions(+), 5 deletions(-)
> >
> > diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
> > index ca4a192a4a..47ea35f864 100644
> > --- a/src/libvirt_private.syms
> > +++ b/src/libvirt_private.syms
> > @@ -2605,6 +2605,7 @@ virProcessGetPids;
> >  virProcessGetStartTime;
> >  virProcessKill;
> >  virProcessKillPainfully;
> > +virProcessKillPainfullyDelay;
> >  virProcessNamespaceAvailable;
> >  virProcessRunInMountNamespace;
> >  virProcessSchedPolicyTypeFromString;
> > diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> > index 02fdc55156..b7bf8813da 100644
> > --- a/src/qemu/qemu_process.c
> > +++ b/src/qemu/qemu_process.c
> > @@ -6814,8 +6814,9 @@ qemuProcessKill(virDomainObjPtr vm, unsigned int
> flags)
> >          return 0;
> >      }
> >
> > -    ret = virProcessKillPainfully(vm->pid,
> > -                                  !!(flags &
> VIR_QEMU_PROCESS_KILL_FORCE));
> > +    ret = virProcessKillPainfullyDelay(vm->pid,
> > +                                       !!(flags &
> VIR_QEMU_PROCESS_KILL_FORCE),
> > +                                       vm->def->nhostdevs * 2);
>
> This API contract is a bit wierd. You've got an arbitray x2 multiplier
> here...
>

Out of the discussion with Alex I derived that due to the 1 sec settling of
the bus plus some extra work 2 seconds per device would be a good rule of
thumb.
The delay is meant to be in seconds and here it requests 2*nhostdevs to get
the amount needed.

And I'd not want to make the "unit" of the call "double-seconds" :-)

Let me add a comment at this call to explain the reasoning on the
multiplier here.

> diff --git a/src/util/virprocess.c b/src/util/virprocess.c
> > index ecea27a2d4..46360cc051 100644
> > --- a/src/util/virprocess.c
> > +++ b/src/util/virprocess.c
> > @@ -341,15 +341,19 @@ int virProcessKill(pid_t pid, int sig)
> >   * Returns 0 if it was killed gracefully, 1 if it
> >   * was killed forcibly, -1 if it is still alive,
> >   * or another error occurred.
> > + *
> > + * Callers can proide an extra delay to wait longer
> > + * than the default.
>
> Mention that it is in "seconds"
>

Ack

>   */
> >  int
> > -virProcessKillPainfully(pid_t pid, bool force)
> > +virProcessKillPainfullyDelay(pid_t pid, bool force, unsigned int
> extradelay)
> >  {
> >      size_t i;
> >      int ret = -1;
> > +    unsigned int delay = 75 + (extradelay*5);
>
> ...and it gets another x5 multiplier here.
>

That *5 is due to the internal unit it polls being 0.2 seconds.
I think externally anything other than seconds (double-seconds,
half-deca-seconds, ...) makes no sense.

I'll call it "polldelay" instead to make it more clear that.
That uncommon *5 ratio was there before, just not as redable (it just
defined 75 in the loop header).
I'll add some text about it as well to be sure.


> Feels nicer to just have the caller provide a x10 multiplier and
> honour that directly
>

See above, the unit being seconds seems the most readable IMHO.
With the changes made I think it won't be as confusing.

>      const char *signame = "TERM";
> >
> > -    VIR_DEBUG("vpid=%lld force=%d", (long long)pid, force);
> > +    VIR_DEBUG("vpid=%lld force=%d delay=%u", (long long)pid, force,
> pid);
>
> s/pid/extradelay/
>

Yes, that was mentioned by Bjoern and queued that way.
Also in the text the delay= should be extradelay= which is ready that way.

>
> >      /* This loop sends SIGTERM, then waits a few iterations (10 seconds)
> >       * to see if it dies. If the process still hasn't exited, and
> > @@ -357,9 +361,12 @@ virProcessKillPainfully(pid_t pid, bool force)
> >       * wait up to 5 seconds more for the process to exit before
> >       * returning.
> >       *
> > +     * An extra delay can be specified for cases that are expected to
> clean
> > +     * up slower than usual.
> > +     *
> >       * Note that setting @force could result in dataloss for the
> process.
> >       */
> > -    for (i = 0; i < 75; i++) {
> > +    for (i = 0; i < delay; i++) {
> >          int signum;
> >          if (i == 0) {
> >              signum = SIGTERM; /* kindly suggest it should exit */
> > @@ -402,6 +409,11 @@ virProcessKillPainfully(pid_t pid, bool force)
> >  }
> >
> >
> > +int virProcessKillPainfully(pid_t pid, bool force)
> > +{
> > +    return virProcessKillPainfullyDelay(pid, force, 0);
> > +}
> > +
> >  #if HAVE_SCHED_GETAFFINITY
> >
> >  int virProcessSetAffinity(pid_t pid, virBitmapPtr map)
> > diff --git a/src/util/virprocess.h b/src/util/virprocess.h
> > index 3c5a882772..b72603ca8e 100644
> > --- a/src/util/virprocess.h
> > +++ b/src/util/virprocess.h
> > @@ -55,6 +55,7 @@ virProcessWait(pid_t pid, int *exitstatus, bool raw)
> >  int virProcessKill(pid_t pid, int sig);
> >
> >  int virProcessKillPainfully(pid_t pid, bool force);
> > +int virProcessKillPainfullyDelay(pid_t pid, bool force, unsigned int);
>
> Missing parameter name here
>

Ack, and then breaking it into new lines for 80 chars

Thanks all these will be part of a v4 ready after some syntax check and
test builds ...

Regards,
> Daniel
> --
> |: https://berrange.com      -o-
> https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-
> https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-
> https://www.instagram.com/dberrange :|
>


-- 
Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20180821/1d45df72/attachment-0001.htm>


More information about the libvir-list mailing list