[libvirt] help debugging event timing

Daniel P. Berrange berrange at redhat.com
Thu Feb 20 17:42:24 UTC 2014


On Wed, Jan 01, 2014 at 03:44:10PM -0700, Eric Blake wrote:
> In my recent work on events, I'm running into something where I'm
> stumped on timing behavior.  I'm using
> virEventRegisterDefaultImpl()/virEventRunDefaultImpl() as my event loop,
> and trying to test that an event is firing correctly.
> 
> With the test:///default (or any other local URI), if I register an
> event callback function, the callback gets invoked almost right after
> the event happens.  This is the same behavior under test in our
> testsuite (in objecteventtest.c). But with qemu:///system (or any other
> remote URI), when I register an event callback function, my callback
> does not get invoked until I make some other RPC call on the connection.

Ok, that's definitely broken.

>  At first, I thought the events were not being wired up at all, and that
> I was just blocking forever because I missed something, but while
> stepping through gdb on the client side, I noticed that calling
> virConnectDomainEventDeregisterAny() in my shutdown code then indeed
> proceeded to go through the entire backlog of events that had been
> generated, but at that point I was no longer registered so my callback
> was not called.  So I ended up using virEventAddTimeout() to set up a
> periodic interrupt (at 10 seconds, slow enough to see the delays), where
> in the handler I called virConnectGetLibVersion() as a way to force RPC
> traffic and at least get events as a result of my timer.  If it matters,
> I was initially using keepalive_interval=-1 in
> /etc/libvirt/libvirtd.conf in order to avoid killing my debug sessions;
> but even when I switched to keepalive_interval=3, I'm still seeing the
> same delayed behavior (if the keepalive were to make a difference, I
> would have expected seeing the event at 3 seconds instead of my timer at
> 10 seconds).

Do you see the same behaviour with  the event-test.c program ? If
not its a sign of something not right in virsh I guess .

> I thought that the RPC protocol was supposed to allow asynchronous
> events, and with minimal downtime.  gdb in libvirtd says that
> remoteDispatchObjectEventSend() is called in a timely manner, but that
> merely queues things up, and I got lost trying to follow where
> virNetSocketUpdateIOCallback() turned into something that wakes up the
> actual write() to the socket.  On the other side, I don't see a call to
> virDomainEventDispatchDefaultFunc() which then calls my callback until
> the next RPC call/response, but again, I got lost on trying to trace
> whether a read() of the socket was getting data in a timely manner but
> just not firing anything in my event loop, or whether things were
> getting stuck client side after the read but with nothing waking up my
> event loop.  Any advice on how to better debug where things are getting
> stuck, to determine whether the holdup is in the server not sending or
> in the client not being responsive to the send?  Is it something where
> systemtap might give me a better picture?  Or better yet, a patch to fix
> the problem would be nice.

Yep, client registered callbacks should see the event the moment
it comes off the wire with negligble delay.

The  examples/systemtap/rpc-monitor.stp demo will snoop on all wire
events so you can see exactly when they are sent by the server and
queued for dispatch by the client.

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|




More information about the libvir-list mailing list