[libvirt] [PATCH v1] libvirtd: Increase NL buffer size for lots of interface

Mon Jan 11 06:52:43 UTC 2016

[libvirt][PATCH v2] libvirtd: Increase NL buffer size for lots of 
interface has been sent.
Appreciated your comment and review.  Thanks

On 2016年01月07日 02:25, Laine Stump wrote:
> Now that everyone is back from holidays, I wanted to revisit this 
> patch (since the message I'm replying to contains most of the 
> discussion, but not the original patch, I've attached it at the end) 
> and hopefully get input from Tomas Graf about whether or not the 
> problems with netlink/libnl noted in the commit message that prevent 
> proper operation of message peeking are eliminated in recent versions 
> (if so, then I think the "128k buffer + enable peeking" you propose 
> will be a proper permanent fix.)
>
> Also, in the message referenced below I had asked a question about the 
> "LTC-Bugzilla" references, and pointed out that your patch only works 
> with libnl3 (not libnl-1.1) and only fixes the problem in one of 
> several places that libvirt uses libnl to create a netlink socket in 
> this message, then suggested a method of making a general purpose fix 
> for all uses an all versions of libnl:
>
> https://www.redhat.com/archives/libvir-list/2015-December/msg00697.html
>
> Can you create a new version of the patch that provides a general 
> solution as suggested (or similar) and either gives a clickable link 
> to the referenced LTC-Bugzilla issues or removes the reference (former 
> is preferred)?
>
> Thanks!
>
> On 12/20/2015 05:06 PM, Laine Stump wrote:
>> On 12/18/2015 02:30 AM, Leno Hou wrote:
>>>
>>> On 2015年12月17日 02:33, Laine Stump wrote:
>>>> On 12/16/2015 10:24 AM, Michal Privoznik wrote:
>>>>> On 10.12.2015 07:34, Leno Hou wrote:
>>>>>> 1. When switching CPUs to offline/online in a system more than 
>>>>>> 128 cpus
>>>>>> 2. When using virsh to destroy domain in a system with more 
>>>>>> interface
>>>>>>
>>>>>> All of above happens nl_recv returned with error: No buffer space 
>>>>>> available.
>>>>>> This patch set socket buffer size to 128K and turn on message 
>>>>>> peeking for nl_recv,
>>>>>> as this would solve this problem totally and permanetly.
>>>>>>
>>>>
>>>>>> Signed-off-by: Leno Hou <houqy at linux.vnet.ibm.com>
>>>>>> Cc: Wenyi Gao <wenyi at linux.vnet.ibm.com>
>>>>>> ---
>>>>>>   src/util/virnetlink.c | 8 ++++++++
>>>>>>   1 file changed, 8 insertions(+)
>>>>>>
>>>>>> diff --git a/src/util/virnetlink.c b/src/util/virnetlink.c
>>>>>> index 679b48e..c8c9fe0 100644
>>>>>> --- a/src/util/virnetlink.c
>>>>>> +++ b/src/util/virnetlink.c
>>>>>> @@ -696,6 +696,14 @@ virNetlinkEventServiceStart(unsigned int 
>>>>>> protocol, unsigned int groups)
>>>>>>           goto error_server;
>>>>>>       }
>>>>>>   +    if (nl_socket_set_buffer_size(srv->netlinknh, 131702, 0) < 
>>>>>> 0) {
>>>>
>>>> The above function doesn't exist in libnl 1.1 (still used in 
>>>> RHEL6/CentOS6, for example), so that would cause a build failure on 
>>>> some systems. In libnl 1.1 the function is called 
>>>> nl_set_buffer_size().
>>>>
>>>> Also, how did you arrive at 128k for the default buffer size? What 
>>>> kind of sizes are you seeing?
>>> This buffer size is the receive socket buffer size. When I switching 
>>> CPUs to offline/online, it's receives 107K.
>>
>> Wow! What is in this message, and where/how is it used in the code?
>>
>>>
>>>>
>>>>
>>>>>> + virReportSystemError(errno,
>>>>>> +                "%s",_("cannot set netlink socket buffer size to 
>>>>>> 128k"));
>>>>>> +        goto error_server;
>>>>>> +    }
>>>>>> +
>>>>>> +    nl_socket_enable_msg_peek(srv->netlinknh);
>>>>>> +
>>>>
>>>> According to a link I followed from another message on this topic 
>>>> last week, libnl's message peeking can't be guaranteed to always 
>>>> work, because netlink doesn't always return the proper buffer size 
>>>> (depending on version).
>>> I would not use message peeking  due to I agree with you on that 
>>> message peeking can't be guaranteed to always work.
>>
>> I'm hoping that is only for "older" versions of kernel/netlink/libnl. 
>> Thomas?
>>
>>>>
>>>>>>       if ((srv->eventwatch = virEventAddHandle(fd,
>>>>>> VIR_EVENT_HANDLE_READABLE,
>>>>>> virNetlinkEventCallback,
>>>>>>
>>>>> I believe this patch appears over and over again. Usually, the 
>>>>> problem
>>>>> was in libnl library we use and this was just a workaround. Can 
>>>>> you test
>>>>> with the latest libnl version (probably even GIT HEAD) and see if 
>>>>> that
>>>>> helps?
>>>>
>>>> I had the same memory. So I just looked back through the history of 
>>>> bug reports about this issue, and found the following:
>>>>
>>>> * libnl-1.1 and libnl-3 both originally set the default message 
>>>> buffersize to 4096 bytes, with MSG_PEEK turned off.
>>>>
>>>> * when this problem came up in RHEL6, it was unfortunately reported 
>>>> as a private BZ (a pet peeve of mine), and the result of the 
>>>> discussion about it was that libnl-1.1 (the version used in RHEL6) 
>>>> was patched *upstream* to set the default message buffersize to 
>>>> 16384 bytes (getpagesize() * 4), which would solve the problem for 
>>>> even very large numbers of VFs. That was in 2013 and there have 
>>>> been no further reports against RHEL6.
>>>>
>>>> * Although I had assumed the problem was solved, it again came up 
>>>> in RHEL7 (which uses libnl-3 - a slightly different API, and 
>>>> maintained in a separate git repo), this time in a public BZ:
>>>>
>>>>   https://bugzilla.redhat.com/show_bug.cgi?id=1040626
>>> I'm not familar with this bug :-(    I think this maybe different 
>>> from my case.
>>
>> Well, I wasn't even aware that libvirt's netlink service was used for 
>> CPU hotplug (and still don't see how/where that happens) :-)
>>
>> The above bug reports were due to 4096 bytes not being enough to read 
>> an entire  IFLA_VF_PORTS array.
>>
>>
>>>
>>>>
>>>> I asked if perhaps the change that had been made upstream in 
>>>> libnl-1.1 hadn't also been made to libnl-3 (this is what I assumed 
>>>> during the previous incident). It hadn't. So the same change was 
>>>> made for libnl-3, both upstream and as a backport to RHEL7, and 
>>>> everyone was happy.
>>>
>>>> I have very little detailed memory of that time (the above was all 
>>>> recalled by looking at archives of discussions) but what had stuck 
>>>> in my mind was "This problem has been fixed in libnl, so libvirt 
>>>> should NOT put in "workarounds" for broken versions of libnl."
>>>>
>>>> But if you are using a version of libnl3 with this patch (which was 
>>>> in libnl-3.2.22 upstream, and is in the libnl-3.2.21 that's in 
>>>> RHEL7.0+), :
>>>>
>>>> https://github.com/tgraf/libnl/commit/807fddc4cd9ecb12ba64e1b7fa26d86b6c2f19b0 
>>>>
>>> This patch sets *message* buffer size to 4 pages.  But I'm prefer to 
>>> say I wanna to set *socket* buffer size to 128k
>>>
>>>>
>>>> then the change to quadruple the buffer size in libnl was 
>>>> insufficient, (and also, when I looked back at the discussion now, 
>>>> > I see that the libnl maintainer had said "The permanent fix would 
>>>> be for libvirt to enable message peeking", so I suppose it's time 
>>>> to "bite the bullet" and enable netlink message 
>>> Yes, I would to set socket buffer size in libvirt because If we can 
>>> set socket buffer size in libnl, it's can give impact on the app 
>>> that do not need 128K socket buffer size.
>>
>> Yes. 16K is one thing, but 128K is quite a lot more, and could more 
>> easily cause a problem with other applications.
>>
>>>
>>>> peeking in libvirt (but, since there are apparently versions of 
>>>> netlink that don't properly inform libnl when a re-read is 
>>>> necessary, we also need to increase the default buffer size).
>>>>
>>>> However, your patch is only fixing the problem in one place. There 
>>>> are several places that we allocate netlink sockets, and they 
>>>> should all get the same fix, implying that there should be a common 
>>>> function called by all three. Fortunately, we already have a macro 
>>>> called "virNetlinkAlloc" that is #defined differently depending on 
>>>> the libnl version - this macro can simply be made into a static 
>>>> function that is defined differently depending on libnl version. It 
>>>> can call nl_handle_alloc or nl_socket_alloc depending on libnl 
>>>> version, then call nl_socket_set_buffer_size/nl_set_buffer_size 
>>>> depending on version, and finally call nl_socket_enable_msg_peek.
>>>>
>>>> A bit of due diligence about the default buffer size is in order 
>>>> though - just to make sure that we don't open a ton of netlink 
>>>> sockets at the same time, each with an unnecessarily huge buffer.
>>>>
>>> Does anyone has an good idea to solve this problem permanently? 
>>> i.e.  Dynamically allocate the socket buffer size accordingly. Thanks
>>>
>>
>> I *think* (based on the things I've read, not on personal experience) 
>> that sufficiently new versions of libnl and kernel *do* properly 
>> implement message peeking, so we can just start off with a buffer 
>> size larger than what we think will ever be needed on a current 
>> system, and also enable message peeking, which should future-proof 
>> the code (so, basically what you've done). I'm Cc'ing Thomas Graf 
>> (from libnl) for his opinion though.
>