[libvirt] Symptoms of main loop slowing down in libvirtd

Tue May 2 10:31:40 UTC 2017

Thanks for the quick response, Peter!
This confirms the basic approach I had in mind.
It needs some (not-so-small) cleanup of the qemu driver code, and I have
already started cleaning up some of it. I am planning to have a constant
number of event handler threads to start with. I'll try adding this as a
configurable parameter in qemu.conf once basic functionality is completed.
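
To make the idea concrete, here is a rough toy model in Python. It is not
libvirt code, and every name in it (EventDispatcher, park, handler) is
hypothetical. It assumes events are keyed by VM name and that a fixed number
of worker threads each own one FIFO queue. The main loop only enqueues; the
worker is the one that may block on the per-VM lock. A CRC32 hash of the VM
name picks the queue, so all events for a given VM land on the same thread
and stay in order.

```python
import queue
import threading
import zlib


class EventDispatcher:
    """Toy model: fixed pool of workers, one FIFO queue per worker."""

    def __init__(self, num_workers, handler):
        # handler(vm_name, event) stands in for the driver code that
        # takes the per-VM lock and processes the event.
        self._handler = handler
        self._queues = [queue.Queue() for _ in range(num_workers)]
        self._threads = [
            threading.Thread(target=self._worker, args=(q,), daemon=True)
            for q in self._queues
        ]
        for t in self._threads:
            t.start()

    def park(self, vm_name, event):
        """Called from the main loop: enqueue only, never touch a VM lock."""
        # Stable hash so the same VM always maps to the same worker queue.
        idx = zlib.crc32(vm_name.encode()) % len(self._queues)
        self._queues[idx].put((vm_name, event))

    def _worker(self, q):
        while True:
            item = q.get()
            if item is None:  # sentinel queued by shutdown()
                break
            vm_name, event = item
            # May block for a long time; only this worker is stalled.
            self._handler(vm_name, event)

    def shutdown(self):
        """Drain all queues, then stop and join the workers."""
        for q in self._queues:
            q.put(None)
        for t in self._threads:
            t.join()
```

With this shape, something like EventDispatcher(4, handler).park("vm-a",
"SHUTDOWN") returns immediately no matter how long the handler takes. The
trade-off of a fixed pool is that one slow VM also delays the other VMs that
hash to the same worker, which is part of why I want the thread count to be
tunable later.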

Thanks,
Prerna

On Tue, May 2, 2017 at 3:56 PM, Peter Krempa <pkrempa at redhat.com> wrote:

> (Dropped invalid address from cc-list)
>
> On Tue, May 02, 2017 at 15:33:47 +0530, Prerna wrote:
> > Hi all,
> > On my host, I have been seeing keepalive responses slow down
> > intermittently when issuing bulk power offs.
> > With some tips from Danpb on the channel, I was able to trace via systemtap
> > that the main event loop would not run for about 6-9 seconds. This would
> > stall keepalives and kill client connections.
> >
> > I was able to trace it to the fact that qemuProcessHandleEvent() needed the
> > VM lock, and this was called from the main loop. I had hook scripts that
> > slightly lengthened the time the power-off RPC took to complete, and the
> > subsequent keepalive delays were noticeable.
>
> I filed a bug about this a while ago:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1402921
>
> > I agree that the easiest solution is to release the VM lock before hook
> > scripts are activated.
> > However, I was wondering why we contend on the per-VM lock directly from
> > the main loop at all? Can we do this instead: have the main loop "park"
> > events on a separate event queue, and then have a dedicated thread pool in
> > the qemu driver pick up these raw events and then try grabbing the per-VM
> > lock for that VM?
> > That way, we can be sure that the main event loop is _never_ delayed,
> > irrespective of an RPC dragging on.
> >
> > If this sounds reasonable I will be happy to post the driver rewrite
> > patches to that end.
>
> And this is the solution I planned to implement. Note that in the worst
> case you need one thread per VM (if all are busy), but the thread pool
> should not be needlessly large. Requests for a single VM obviously need
> to be queued with the same thread.
>