[libvirt] [PATCH 0/4]

Daniel Veillard veillard at redhat.com
Tue Aug 17 21:10:20 UTC 2010


On Tue, Aug 17, 2010 at 05:00:22PM +0100, Daniel P. Berrange wrote:
> For 
> 
>   https://bugzilla.redhat.com/show_bug.cgi?id=620847
> 
> We have had sporadic reports of
> 
>   # virsh capabilities
>   error: failed to get capabilities
>   error: server closed connection:
> 
> This normally means that libvirtd has crashed, closing the connection,
> but in this case libvirtd has always remained running. It turns out
> that the capabilities XML was too large for the remote RPC message
> size, which caused XDR serialization to fail and libvirtd to close
> the client connection immediately. The XML was so large because
> libvirt was not handling an edge case in libnuma, where it returns a
> CPU mask of all-1s to indicate a non-existent node.
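
To make the all-1s edge case concrete, here is a minimal sketch of how such a mask could be detected before it is expanded into a per-CPU list. The helper name mask_is_all_ones and the bitmask layout (an array of unsigned long) are assumptions for illustration only; this is not the code from the patch and not a libnuma or libvirt function.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of libvirt or libnuma).
 * Returns true when each of the first nbits bits in mask is set,
 * which is how a "node does not exist" sentinel would look.
 * An empty mask (nbits == 0) is not treated as a sentinel. */
static bool mask_is_all_ones(const unsigned long *mask, size_t nbits)
{
    const size_t bits_per_word = sizeof(unsigned long) * CHAR_BIT;
    const size_t full_words = nbits / bits_per_word;
    const size_t tail_bits = nbits % bits_per_word;

    assert(mask != NULL);

    if (nbits == 0)
        return false;

    /* Every fully used word must have all bits set. */
    for (size_t i = 0; i < full_words; i++) {
        if (mask[i] != ~0UL)
            return false;
    }

    /* Check only the used low bits of a trailing partial word. */
    if (tail_bits != 0) {
        const unsigned long tail = (1UL << tail_bits) - 1;
        if ((mask[full_words] & tail) != tail)
            return false;
    }

    return true;
}
```

A caller building the NUMA part of the capabilities could skip any cell whose mask passes this check instead of listing every possible CPU for it.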
> 
> Machines that exhibit this problem will show this as a symptom in
> the logs
> 
>  # grep NUMA /var/log/messages 
>  Aug 16 10:30:34 sgi-xe270-01 libvirtd: 10:30:34.933: warning : 
>  nodeCapsInitNUMA:388 : NUMA topology for cell 1 of 2 not available, ignoring
> 
> and have a sparse NUMA topology (i.e. empty nodes).
> 
> This series does many things:
> 
>  - Adds explicit warnings in places where XDR serialization fails,
>    so we see an indication of the problem in /var/log/messages
>  - Tries to send a real remote_error back to the client, instead of
>    closing its connection
>  - Adds logging of the capabilities XML in libvirt.c so we can
>    identify the oversized document in libvirtd
>  - Adds a fix to cope with the all-1s node mask
> 
> This may also fix some other unexplained bug reports we've had with
> 'server closed connection' messages, or at least make it possible
> to diagnose them
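
The general pattern behind the first two items (warn when a payload cannot be serialized, and answer with an error instead of dropping the connection) can be sketched as below. All names here (SKETCH_MAX_PAYLOAD, struct reply, build_reply) and the 64-byte limit are made up for the example; the real RPC limit and the XDR code in libvirtd are different, so treat this as an illustration of the idea only.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical size limit for the sketch, not libvirt's RPC limit. */
#define SKETCH_MAX_PAYLOAD 64

enum reply_kind { REPLY_OK, REPLY_ERROR };

struct reply {
    enum reply_kind kind;
    char body[SKETCH_MAX_PAYLOAD];
};

/* Fill in a reply for the client. If the payload does not fit, log a
 * warning and build an error reply rather than giving up, so the
 * client sees a real error message and the log shows the cause. */
static void build_reply(struct reply *out, const char *payload)
{
    const size_t len = strlen(payload);

    if (len >= sizeof(out->body)) {
        fprintf(stderr,
                "warning: payload of %zu bytes exceeds limit of %zu\n",
                len, sizeof(out->body) - 1);
        out->kind = REPLY_ERROR;
        snprintf(out->body, sizeof(out->body),
                 "payload too large (%zu bytes)", len);
        return;
    }

    out->kind = REPLY_OK;
    memcpy(out->body, payload, len + 1); /* copy including the NUL */
}
```

With this shape, an oversized document produces both a log line on the server and an explicit error on the client, rather than a bare 'server closed connection'.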

  ACK for the 4 patches, as for upstream,

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel at veillard.com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/