[libvirt] [PATCH] phyp: too much timeout when polling socket
Daniel Veillard
veillard at redhat.com
Wed Nov 11 13:46:12 UTC 2009
On Wed, Nov 11, 2009 at 11:06:50AM +0000, Daniel P. Berrange wrote:
> On Wed, Nov 11, 2009 at 11:55:29AM +0100, Daniel Veillard wrote:
> > On Wed, Nov 11, 2009 at 03:11:31AM -0200, Eduardo Otubo wrote:
> > > Hello all,
> > >
> > > Since I moved to libssh2 I had noticed a weird behaviour, virsh was
> > > taking too much time to complete the operations when using phyp driver.
> > > Just found the problem, 10 seconds of timeout passed to select().
> > > Changed to zero, since I'm just polling the socket.
> > >
> > > Actually this patch is more important than it seems, now I can write a
> > > script (using virsh) to test all the phyp features.
> >
> > The problem is that you end up creating a busy loop by doing this,
> > and I would be surprized if that patch was actually correct. Somehow
> > somewhere in one of the loops calling waitsocket() you should not make
> > that call, this is what is blocking you and using gdb (or another
> > debugger) during that time out will allow you to find out exactly where
> > this is hapening. So rather than just dropping the ball and just looping
> > you should take that opportunity of a reproduceable bug to debug it :-)
> >
> > In the meantime moving the timeout from 10s to 1 millisecond by
> > changing tv_sec to 0 and tv_usec to 1000 is certainly a good
> > workaround, it will still avoid looping needlessly, and the 1ms extra
> > timeout should not break your scripts. I will change the code
> > accordingly, but you should debug your issue with the 10s delay (or even
> > increase it if it makes things easier to set up).
> >
> > So I pushed that modified fix but please debug the issue :-)
>
> I don't think that is correct really.
>
> This function is invoked when libssh2 gets EAGAIN, and needs to wait for
> more data to arrive on the socket. It should be waiting for select() to
> show the file handle is writable or readable, arguably using an inifinite
> timeout, particularly because no caller of waitsocket() ever bothers to
> check the return value for a timeout errors
The problem is that right now, the code doesn't call it only when
getting EAGAIN from libssh2 but in a number of various loops. It's
clearly called from places it shouldn't (hence the 10s timeouts), and
current patch is only a band aid. Hence my request to do the actual
debug.
Daniel
--
Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/
daniel at veillard.com | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library http://libvirt.org/
More information about the libvir-list
mailing list