[libvirt] [PATCH v1] libvirtd: Increase NL buffer size for lots of interface
Laine Stump
laine at laine.org
Sun Dec 20 22:06:32 UTC 2015
On 12/18/2015 02:30 AM, Leno Hou wrote:
>
> On 2015年12月17日 02:33, Laine Stump wrote:
>> On 12/16/2015 10:24 AM, Michal Privoznik wrote:
>>> On 10.12.2015 07:34, Leno Hou wrote:
>>>> 1. When switching CPUs to offline/online in a system more than 128
>>>> cpus
>>>> 2. When using virsh to destroy domain in a system with more interface
>>>>
>>>> All of above happens nl_recv returned with error: No buffer space
>>>> available.
>>>> This patch set socket buffer size to 128K and turn on message
>>>> peeking for nl_recv,
>>>> as this would solve this problem totally and permanetly.
>>>>
>>
>>>> Signed-off-by: Leno Hou <houqy at linux.vnet.ibm.com>
>>>> Cc: Wenyi Gao <wenyi at linux.vnet.ibm.com>
>>>> ---
>>>> src/util/virnetlink.c | 8 ++++++++
>>>> 1 file changed, 8 insertions(+)
>>>>
>>>> diff --git a/src/util/virnetlink.c b/src/util/virnetlink.c
>>>> index 679b48e..c8c9fe0 100644
>>>> --- a/src/util/virnetlink.c
>>>> +++ b/src/util/virnetlink.c
>>>> @@ -696,6 +696,14 @@ virNetlinkEventServiceStart(unsigned int
>>>> protocol, unsigned int groups)
>>>> goto error_server;
>>>> }
>>>> + if (nl_socket_set_buffer_size(srv->netlinknh, 131702, 0) < 0) {
>>
>> The above function doesn't exist in libnl 1.1 (still used in
>> RHEL6/CentOS6, for example), so that would cause a build failure on
>> some systems. In libnl 1.1 the function is called nl_set_buffer_size().
>>
>> Also, how did you arrive at 128k for the default buffer size? What
>> kind of sizes are you seeing?
> This buffer size is the receive socket buffer size. When I switching
> CPUs to offline/online, it's receives 107K.
Wow! What is in this message, and where/how is it used in the code?
>
>>
>>
>>>> + virReportSystemError(errno,
>>>> + "%s",_("cannot set netlink socket buffer size to
>>>> 128k"));
>>>> + goto error_server;
>>>> + }
>>>> +
>>>> + nl_socket_enable_msg_peek(srv->netlinknh);
>>>> +
>>
>> According to a link I followed from another message on this topic
>> last week, libnl's message peeking can't be guaranteed to always
>> work, because netlink doesn't always return the proper buffer size
>> (depending on version).
> I would not use message peeking due to I agree with you on that
> message peeking can't be guaranteed to always work.
I'm hoping that is only for "older" versions of kernel/netlink/libnl.
Thomas?
>>
>>>> if ((srv->eventwatch = virEventAddHandle(fd,
>>>> VIR_EVENT_HANDLE_READABLE,
>>>> virNetlinkEventCallback,
>>>>
>>> I believe this patch appears over and over again. Usually, the problem
>>> was in libnl library we use and this was just a workaround. Can you
>>> test
>>> with the latest libnl version (probably even GIT HEAD) and see if that
>>> helps?
>>
>> I had the same memory. So I just looked back through the history of
>> bug reports about this issue, and found the following:
>>
>> * libnl-1.1 and libnl-3 both originally set the default message
>> buffersize to 4096 bytes, with MSG_PEEK turned off.
>>
>> * when this problem came up in RHEL6, it was unfortunately reported
>> as a private BZ (a pet peeve of mine), and the result of the
>> discussion about it was that libnl-1.1 (the version used in RHEL6)
>> was patched *upstream* to set the default message buffersize to 16384
>> bytes (getpagesize() * 4), which would solve the problem for even
>> very large numbers of VFs. That was in 2013 and there have been no
>> further reports against RHEL6.
>>
>> * Although I had assumed the problem was solved, it again came up in
>> RHEL7 (which uses libnl-3 - a slightly different API, and maintained
>> in a separate git repo), this time in a public BZ:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1040626
> I'm not familar with this bug :-( I think this maybe different from
> my case.
Well, I wasn't even aware that libvirt's netlink service was used for
CPU hotplug (and still don't see how/where that happens) :-)
The above bug reports were due to 4096 bytes not being enough to read an
entire IFLA_VF_PORTS array.
>
>>
>> I asked if perhaps the change that had been made upstream in
>> libnl-1.1 hadn't also been made to libnl-3 (this is what I assumed
>> during the previous incident). It hadn't. So the same change was made
>> for libnl-3, both upstream and as a backport to RHEL7, and everyone
>> was happy.
>
>> I have very little detailed memory of that time (the above was all
>> recalled by looking at archives of discussions) but what had stuck in
>> my mind was "This problem has been fixed in libnl, so libvirt should
>> NOT put in "workarounds" for broken versions of libnl."
>>
>> But if you are using a version of libnl3 with this patch (which was
>> in libnl-3.2.22 upstream, and is in the libnl-3.2.21 that's in
>> RHEL7.0+), :
>>
>> https://github.com/tgraf/libnl/commit/807fddc4cd9ecb12ba64e1b7fa26d86b6c2f19b0
>>
> This patch sets *message* buffer size to 4 pages. But I'm prefer to
> say I wanna to set *socket* buffer size to 128k
>
>>
>> then the change to quadruple the buffer size in libnl was
>> insufficient, (and also, when I looked back at the discussion now, >
>> I see that the libnl maintainer had said "The permanent fix would be
>> for libvirt to enable message peeking", so I suppose it's time to
>> "bite the bullet" and enable netlink message
> Yes, I would to set socket buffer size in libvirt because If we can
> set socket buffer size in libnl, it's can give impact on the app that
> do not need 128K socket buffer size.
Yes. 16K is one thing, but 128K is quite a lot more, and could more
easily cause a problem with other applications.
>
>> peeking in libvirt (but, since there are apparently versions of
>> netlink that don't properly inform libnl when a re-read is necessary,
>> we also need to increase the default buffer size).
>>
>> However, your patch is only fixing the problem in one place. There
>> are several places that we allocate netlink sockets, and they should
>> all get the same fix, implying that there should be a common function
>> called by all three. Fortunately, we already have a macro called
>> "virNetlinkAlloc" that is #defined differently depending on the libnl
>> version - this macro can simply be made into a static function that
>> is defined differently depending on libnl version. It can call
>> nl_handle_alloc or nl_socket_alloc depending on libnl version, then
>> call nl_socket_set_buffer_size/nl_set_buffer_size depending on
>> version, and finally call nl_socket_enable_msg_peek.
>>
>> A bit of due diligence about the default buffer size is in order
>> though - just to make sure that we don't open a ton of netlink
>> sockets at the same time, each with an unnecessarily huge buffer.
>>
> Does anyone has an good idea to solve this problem permanently? i.e.
> Dynamically allocate the socket buffer size accordingly. Thanks
>
I *think* (based on the things I've read, not on personal experience)
that sufficiently new versions of libnl and kernel *do* properly
implement message peeking, so we can just start off with a buffer size
larger than what we think will ever be needed on a current system, and
also enable message peeking, which should future-proof the code (so,
basically what you've done). I'm Cc'ing Thomas Graf (from libnl) for his
opinion though.
More information about the libvir-list
mailing list