[libvirt] [sandbox] random hang in g_poll

Daniel P. Berrange berrange at redhat.com
Mon Jun 18 09:04:40 UTC 2012


On Sat, Jun 16, 2012 at 03:25:40PM +0300, Radu Caragea wrote:
> Hello,
> I am building a tool on top of libvirt-sandbox and I've noticed that
> there are random hangs in the g_main_loop_run call. Normally, only
> very rarely does it actually hang, the problem appears let's say 4 out
> of 5 times when I run the program under valgrind or when I am using an
> LD_PRELOAD for checking gobject ref counts (gobject-list). I ran these
> directly on virt-sandbox to ensure that it's not from my code and it
> still happens.
> Attached is the output of a run that hangs followed by the backtrace.
> As I'm new to the gmainloop mechanisms I don't really know where to
> start debugging. Any ideas would be appreciated.

If you see a stack trace ending in 'g_poll', it means that we
are waiting for incoming I/O, or waiting for a socket to become
writable. The timeout=-1 in your backtrace means the wait has no
time limit. As such it isn't really a hang in libvirt-sandbox itself.
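
To illustrate what a blocked poll looks like, here is a minimal sketch
in plain Python (not GLib or libvirt-sandbox code; the helper name
wait_readable is made up for this example). It polls the read end of a
pipe with a finite timeout: with nothing written, poll returns no
events, which is the same state g_poll stays in forever when the
timeout is -1 and the other side never sends anything.

```python
import os
import select


def wait_readable(fd, timeout_ms):
    """Return True if fd becomes readable within timeout_ms.

    Passing None as timeout_ms blocks indefinitely, like timeout=-1
    in the C poll() call.
    """
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    events = poller.poll(timeout_ms)
    return any(event & select.POLLIN for _, event in events)


read_fd, write_fd = os.pipe()

# Nothing has been written yet, so the poll times out with no events.
idle = wait_readable(read_fd, 100)

# Once the peer writes a byte, the same poll returns immediately.
os.write(write_fd, b"x")
ready = wait_readable(read_fd, 100)

os.close(read_fd)
os.close(write_fd)
```

In the sandbox case the "peer" is the guest, so a poll that never
returns points at the guest not producing any I/O.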


[snip]

> [    0.558902] virtio-pci 0000:00:04.0: irq 44 for MSI/MSI-X
> [    0.560380] virtio-pci 0000:00:04.0: irq 45 for MSI/MSI-X
> [    0.704556] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input2
> [    1.140052] Switching to clocksource tsc

The fact that the last message is about clock sources makes me
suspect that the guest OS itself has hung. This would also tally
with the fact that you say it is quite random / unpredictable.


> Program received signal SIGTRAP, Trace/breakpoint trap.
> 0x00000000097152b8 in __GI___poll (fds=0xcfc5400, nfds=3, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:83
> 83          return INLINE_SYSCALL (poll, 3, CHECK_N (fds, nfds), nfds, timeout);
> (gdb) bt
> #0  0x00000000097152b8 in __GI___poll (fds=0xcfc5400, nfds=3, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:83
> #1  0x000000000935faef in g_poll (fds=0xcfc5400, nfds=3, timeout=-1) at gpoll.c:132
> #2  0x000000000934f217 in g_main_context_poll (context=0xcf46ec0, timeout=-1, priority=2147483647, fds=0xcfc5400, n_fds=3)
>     at gmain.c:3440
> #3  0x000000000934eb6c in g_main_context_iterate (context=0xcf46ec0, block=1, dispatch=1, self=0xcfec630) at gmain.c:3141
> #4  0x000000000934efc0 in g_main_loop_run (loop=0xcf35320) at gmain.c:3340
> #5  0x0000000000402b9d in main (argc=1, argv=0x7fefffa18) at virt-sandbox.c:249


I think that we probably ought to make the libvirt-sandbox KVM builder
add the following XML config to its guests.

  <clock offset='utc'>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='rtc' tickpolicy='catchup'/>
  </clock>

I'm not sure if it'll fix this particular hang, but I think it is a
good thing in general.
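
For anyone who wants to try the clock settings before the builder is
changed, here is a rough sketch of adding that element to a domain XML
document. It is an illustration only: it uses Python's ElementTree
rather than the libvirt-sandbox builder code, and the function name
add_clock_policy and the sample domain XML are assumptions made for
this example.

```python
import xml.etree.ElementTree as ET


def add_clock_policy(domain_xml):
    """Return domain_xml with the <clock> element from this mail added.

    Hypothetical helper for illustration; not part of libvirt-sandbox.
    """
    root = ET.fromstring(domain_xml)

    # Drop any existing <clock> so the result contains exactly one.
    for old in root.findall("clock"):
        root.remove(old)

    clock = ET.SubElement(root, "clock", {"offset": "utc"})
    ET.SubElement(clock, "timer", {"name": "pit", "tickpolicy": "delay"})
    ET.SubElement(clock, "timer", {"name": "rtc", "tickpolicy": "catchup"})
    return ET.tostring(root, encoding="unicode")


# Made-up minimal domain document, just enough to show the result.
example = add_clock_policy("<domain type='kvm'><name>sandbox</name></domain>")
print(example)
```

The resulting XML could then be fed to whatever defines the guest.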

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|



