[libvirt] [PATCH] Fix error report from nl_recvmsg

Daniel P. Berrange berrange at redhat.com
Thu Feb 28 16:24:17 UTC 2013


On Thu, Feb 28, 2013 at 04:16:37PM +0000, Daniel P. Berrange wrote:
> On Thu, Feb 28, 2013 at 11:11:53AM -0500, Laine Stump wrote:
> > On 02/28/2013 08:37 AM, Daniel P. Berrange wrote:
> > > From: "Daniel P. Berrange" <berrange at redhat.com>
> > >
> > > The nl_recvmsg does not always set errno. Instead it returns
> > > its own custom set of error codes. Thus we were reporting the
> > > wrong data.
> > > ---
> > >  src/util/virnetlink.c | 5 +++--
> > >  1 file changed, 3 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/src/util/virnetlink.c b/src/util/virnetlink.c
> > > index 0b36fdc..8b47ede 100644
> > > --- a/src/util/virnetlink.c
> > > +++ b/src/util/virnetlink.c
> > > @@ -335,8 +335,9 @@ virNetlinkEventCallback(int watch,
> > >      if (length == 0)
> > >          return;
> > >      if (length < 0) {
> > > -        virReportSystemError(errno,
> > > -                             "%s", _("nl_recv returned with error"));
> > > +        virReportError(VIR_ERR_INTERNAL_ERROR,
> > > +                       _("nl_recv returned with error: %s"),
> > > +                       nl_geterror(length));
> > 
> > My recollection is that we specifically avoided calling nl_geterror()
> > because it isn't threadsafe.
> > 
> > I'll go take another look to verify.
> 
> I did check this, but only for libnl3 which merely does a static
> string table lookup:

Oh joy, it is worse than you could possibly imagine.

On libnl1 the return value is a valid -errno, while in libnl3
the return value is an error code of its own invention.

Further, in libnl1 we can't rely on the global errno, because
other calls libnl makes may have overwritten it with garbage.
We must use the return value from the function.

For yet more fun, libnl1's error handling is not threadsafe.
Whenever it hits an error with a syscall, it internally
runs __nl_error() which mallocs/frees a static global
variable containing the contents of strerror() which is
itself also not threadsafe :-(
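
To make the above concrete, here is a rough sketch of what a caller could
do with the return value alone. Everything named here (struct nl_failure,
classify_nl_failure, format_nl_failure, the have_libnl3 flag) is
hypothetical and exists only for illustration; none of it is libvirt or
libnl API. It assumes both libraries return a negative value on failure,
reads neither the global errno nor any libnl error string, and writes only
into a caller-supplied buffer, so it does not touch shared state.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical illustration only: not part of libvirt or libnl. */
struct nl_failure {
    int is_errno; /* 1: code is a positive errno value (libnl1 case) */
                  /* 0: code is a libnl3-specific error number */
    int code;     /* positive magnitude of the negative return value */
};

/*
 * Interpret a negative return value from the receive call. The meaning
 * depends on which libnl we were built against: libnl1 hands back
 * -errno, libnl3 hands back a negated code of its own. The global errno
 * is deliberately ignored, since other calls may have clobbered it.
 */
static struct nl_failure classify_nl_failure(int have_libnl3, int ret)
{
    struct nl_failure f;

    f.is_errno = !have_libnl3;
    f.code = -ret;
    return f;
}

/* Format a message into a caller-owned buffer; no static storage used. */
static const char *format_nl_failure(struct nl_failure f,
                                     char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);

    if (f.is_errno)
        snprintf(buf, buflen,
                 "nl_recv returned with error: errno %d", f.code);
    else
        snprintf(buf, buflen,
                 "nl_recv returned with error: libnl3 code %d", f.code);
    return buf;
}
```

In the libnl1 case the resulting code is what one would hand to
virReportSystemError() in place of errno; in the libnl3 case it is the
value the patch passes to nl_geterror().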

Did I mention we should just throw out all versions of libnl
entirely and talk to the kernel ourselves? It has caused
us no end of pain in all its versions.

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|



