[libvirt] [PATCH 0/4]
Daniel Veillard
veillard at redhat.com
Tue Aug 17 21:10:20 UTC 2010
On Tue, Aug 17, 2010 at 05:00:22PM +0100, Daniel P. Berrange wrote:
> For
>
> https://bugzilla.redhat.com/show_bug.cgi?id=620847
>
> We have had sporadic reports of
>
> # virsh capabilities
> error: failed to get capabilities
> error: server closed connection:
>
> This normally means that libvirtd has crashed, closing the connection,
> but in this case libvirtd has always remained running. It turns out
> that the capabilities XML was too large for the remote RPC message
> size. This caused XDR serialization to fail, which in turn caused
> libvirtd to close the client connection immediately. The cause of the
> large XML was not handling an edge case in libnuma, where it returns a
> CPU mask of all-1s to indicate a non-existent node.
>
> Machines that exhibit this problem will show this as a symptom in
> the logs
>
> # grep NUMA /var/log/messages
> Aug 16 10:30:34 sgi-xe270-01 libvirtd: 10:30:34.933: warning :
> nodeCapsInitNUMA:388 : NUMA topology for cell 1 of 2 not available, ignoring
>
> and have a sparse NUMA topology (i.e. empty nodes).
>
> This series does many things:
>
> - Add explicit warnings in places where XDR serialization fails,
>   so we see an indication of the problem in /var/log/messages
> - Try to send a real remote_error back to the client, instead of
>   closing its connection
> - Add logging of the capabilities XML in libvirt.c so we can identify
>   the too-large document in libvirtd
> - Add a fix to cope with the all-1s node mask
>
> This may also fix some other unexplained bug reports we've had with
> 'server closed connection' messages, or at least make it possible
> to diagnose them
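
For readers of the archive, the all-1s node mask case described above can be illustrated with a minimal sketch in C. It is not the code from the patches: the helper names are hypothetical, and it assumes the CPU mask is a plain array of unsigned long words that is wider than the real CPU count, so a genuine node never fills it completely.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the patches: returns true when every
 * bit of the mask is set, which per the description above is how a
 * non-existent node is signalled. */
static bool mask_is_all_ones(const unsigned long *mask, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++) {
        if (mask[i] != ULONG_MAX)
            return false;
    }
    /* An empty mask (no words) is not treated as all-1s. */
    return nwords > 0;
}

/* Hypothetical helper: counts the CPUs in a node's mask, treating an
 * all-1s mask as a node with no CPUs instead of a node owning every CPU. */
static size_t node_cpu_count(const unsigned long *mask, size_t nwords)
{
    if (mask_is_all_ones(mask, nwords))
        return 0;

    size_t count = 0;
    for (size_t i = 0; i < nwords; i++) {
        /* Clear the lowest set bit on each pass to count set bits. */
        for (unsigned long w = mask[i]; w != 0; w &= w - 1)
            count++;
    }
    return count;
}
```

Without such a check, a non-existent node would be reported as owning every possible CPU in the mask, which is presumably how the capabilities XML grew past the RPC message size.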
ACK for the 4 patches, as for upstream,
Daniel
--
Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/
daniel at veillard.com | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library http://libvirt.org/