[libvirt] Handling large amount of VF's with intelligent passthrough
Daniel P. Berrange
berrange at redhat.com
Tue Dec 8 16:04:47 UTC 2015
On Tue, Dec 08, 2015 at 05:58:15PM +0200, Jan Gutter wrote:
> Hi,
>
> I've run into a rather interesting problem recently where a weird
> interaction between libnl3 and libvirt caused some difficult-to-debug
> issues. From libvirt's side, the issue was that a netlink response was
> much larger than the pagesize and truncated by libnl3. When
> virNetDevLinkDump() calls virNetlinkCommand(), nl_recv() is supposed
> to return a rather large structure with information for all the
> Virtual Functions. When called in a system where the number of PCIe
> Virtual Functions is more than 30 for a given Physical Function, the
> netlink response is larger than 4k, meaning that a message is
> truncated. Unfortunately libnl3 truncates this silently, meaning that
> a cryptic error pops up much later in virNetDevParseVfConfig(),
> "missing IFLA_VF_INFO in netlink response".
>
> Aside from the error propagation (which might be fixable in libnl3),
> there still remains the need to enable libvirt to function in cases
> like this. This can be done in two ways, both in virNetlinkCommand().
I wish we had never used libnl and had just done everything
directly with netlink sockets. libnl has been a never-ending
source of bugs and breakage.
> 1. Message peeking can be enabled. In theory this slows down any
> netlink messages by doing a two stage query: query the buffer size,
> then allocate the receive buffer and receive the message. This is a
> reliability/performance tradeoff, I guess.
Do you have a gauge of what kind of performance penalty we're
talking about? If this is not in a hot path for libvirt, I'd
be inclined to just accept the hit.
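
For reference, the two-stage receive can be sketched on a plain
socket, without libnl. This is only a minimal sketch under a few
assumptions: it uses an AF_UNIX datagram socketpair as a stand-in
for a netlink socket so that it runs unprivileged, it relies on the
Linux behaviour of MSG_PEEK | MSG_TRUNC returning the real datagram
length, and recv_whole_datagram() and demo_two_stage() are
hypothetical names, not libvirt or libnl functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Receive one whole datagram from fd, sizing the buffer by peeking first.
 * On success returns the datagram length and stores a malloc()ed buffer
 * in *out (the caller frees it). Returns -1 on error.
 */
ssize_t recv_whole_datagram(int fd, unsigned char **out)
{
    unsigned char probe;
    unsigned char *buf;
    ssize_t need, got;

    /* Stage 1: MSG_PEEK leaves the datagram queued; MSG_TRUNC makes
     * recv() report the real length although the buffer is 1 byte. */
    need = recv(fd, &probe, sizeof(probe), MSG_PEEK | MSG_TRUNC);
    if (need < 0)
        return -1;

    buf = malloc(need > 0 ? (size_t)need : 1);
    if (!buf)
        return -1;

    /* Stage 2: the real read, now with enough room. */
    got = recv(fd, buf, (size_t)need, 0);
    if (got < 0) {
        free(buf);
        return -1;
    }
    *out = buf;
    return got;
}

/* Demo: push one 8000-byte datagram (larger than a 4k page) through a
 * socketpair and read it back. Returns the received length or -1. */
ssize_t demo_two_stage(void)
{
    int sv[2];
    unsigned char msg[8000];
    unsigned char *buf = NULL;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        return -1;
    memset(msg, 0xAB, sizeof(msg));
    if (send(sv[0], msg, sizeof(msg), 0) != (ssize_t)sizeof(msg)) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    n = recv_whole_datagram(sv[1], &buf);
    if (n > 0)
        assert(buf[n - 1] == 0xAB); /* the tail of the payload survived */

    free(buf);
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

The extra cost is one additional recv() call plus an allocation per
message, which is the tradeoff being weighed here.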
>
> This is as simple as adding:
>
> nl_socket_enable_msg_peek(nlhandle);
>
> 2. The receive buffer size can also be made larger:
>
> nl_socket_set_msg_buf_size(nlhandle, ARBITRARY_BUFFER_SIZE);
>
> This does not incur a performance penalty, but until libnl3 can
> propagate the truncation error, this merely postpones the error for
> future generations...
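
Whatever buffer size is picked, the truncation itself can be
detected at the socket level: recvmsg() sets MSG_TRUNC in msg_flags
when a datagram did not fit in the supplied buffer. A rough sketch,
using an AF_UNIX datagram socketpair rather than a real netlink
socket, with hypothetical helper names (recv_checked(),
demo_truncation()) and a 4096-byte buffer chosen only to mirror the
4k limit described above:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Receive one datagram into a fixed buffer. Unlike a bare recv(), a
 * datagram that did not fit is reported as an error (errno = EMSGSIZE)
 * instead of being handed back silently shortened.
 */
ssize_t recv_checked(int fd, void *buf, size_t len)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;
    ssize_t n;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return -1;
    if (msg.msg_flags & MSG_TRUNC) {
        /* The kernel discarded the part that did not fit. */
        errno = EMSGSIZE;
        return -1;
    }
    return n;
}

/* Demo: send `payload` bytes and receive them into a 4096-byte buffer.
 * Returns 1 if truncation was detected, 0 if the datagram arrived
 * whole, and -2 on any other failure. */
int demo_truncation(size_t payload)
{
    static unsigned char msg[8000];
    unsigned char buf[4096];
    int sv[2];
    int truncated;
    ssize_t n;

    assert(payload <= sizeof(msg)); /* demo precondition */

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        return -2;
    if (send(sv[0], msg, payload, 0) != (ssize_t)payload) {
        close(sv[0]);
        close(sv[1]);
        return -2;
    }

    n = recv_checked(sv[1], buf, sizeof(buf));
    /* Capture the result before close() can touch errno. */
    truncated = (n < 0 && errno == EMSGSIZE);

    close(sv[0]);
    close(sv[1]);
    if (truncated)
        return 1;
    return n == (ssize_t)payload ? 0 : -2;
}
```

With this, demo_truncation(1000) reports a whole message and
demo_truncation(8000) reports truncation, so an oversized response
fails at the receive call rather than much later in the parser.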
Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|