[libvirt] libvirt will wait 20 minutes or hang when the network interface down

Benjamin Wang (gendwang) gendwang at cisco.com
Mon Dec 24 09:21:35 UTC 2012


Hi Michal,
  In most cases the thread waits about 20 minutes. I don't think 20 minutes is acceptable when a router that connects a lot of devices is down. More importantly, some threads could hang. I am not sure whether
this is a curl bug, but they propose using these two options to fix the problem.
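
To illustrate what the two options are meant to express, here is a minimal sketch of the rule as I understand it from the curl documentation: abort when the transfer rate stays below CURLOPT_LOW_SPEED_LIMIT bytes per second for CURLOPT_LOW_SPEED_TIME seconds. The helper name low_speed_abort_at and the per-second sample array are my own assumptions for illustration only; this is a simplified model, not libcurl's actual implementation.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libcurl or libvirt).
 *
 * bytes_per_sec[i] is the number of bytes transferred during second i.
 * Returns the zero-based second at which a transfer would be considered
 * too slow, i.e. the rate has been below `limit` for `time` consecutive
 * seconds, or -1 if that never happens within the n samples.
 */
static long low_speed_abort_at(const long *bytes_per_sec, size_t n,
                               long limit, long time)
{
    long slow_seconds = 0;

    assert(limit >= 0 && time > 0);

    for (size_t i = 0; i < n; i++) {
        if (bytes_per_sec[i] < limit) {
            /* Still below the limit: extend the slow streak. */
            slow_seconds++;
            if (slow_seconds >= time)
                return (long)i;
        } else {
            /* Speed recovered: the slow streak starts over. */
            slow_seconds = 0;
        }
    }
    return -1;
}
```

With limit 10 and time 120 (the values in my patch), a transfer that moves data for three seconds and then goes completely silent is flagged at second 122 in this model, so roughly two minutes after the stall instead of 20 minutes.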

B.R.
Benjamin Wang

-----Original Message-----
From: Michal Privoznik [mailto:mprivozn at redhat.com] 
Sent: December 24, 2012 16:57
To: Benjamin Wang (gendwang)
Cc: libvir-list at redhat.com; James Ye (jiaye); Yang Zhou (yangzho)
Subject: Re: [libvirt] libvirt will wait 20 minutes or hang when the network interface down

On 22.12.2012 09:59, Benjamin Wang (gendwang) wrote:
> Hi,
> 
>   I find that when the network interface is down, libvirt in most 
> scenarios waits 20 minutes and then reports an exception. In rare 
> scenarios, the polling thread hangs even after the network has recovered.
> 
> The following is the formal description from libcurl website:
> 
> http://curl.haxx.se/docs/faq.html  (Section “4.19 Why doesn't cURL 
> return an error when the network cable is unplugged?”)
> 
> The following is a similar case of a thread hang:
> 
> http://curl.haxx.se/mail/lib-2010-07/0108.html
> 
>  
> 
> As for the 20-minute wait: this is normal TCP behavior, but consider a 
> server that manages tens of thousands of devices through libvirt. When 
> 1000 devices are down, this could leak threads for a long period.
> 
> As for the thread hang, this could leak threads forever.
> 
>  
> 
> I tried adding the following code to esx_vi.c. It seems to avoid the 
> above issues. Could you give me your comments?
> 
>  
> 
>     if (curl->headers == NULL) {
>         virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
>                        _("Could not build CURL header list"));
>         return -1;
>     }
> 
> +    curl_easy_setopt(curl->handle, CURLOPT_LOW_SPEED_LIMIT, 10);
> +    curl_easy_setopt(curl->handle, CURLOPT_LOW_SPEED_TIME, 120);
> 
>     curl_easy_setopt(curl->handle, CURLOPT_USERAGENT, "libvirt-esx");
>     curl_easy_setopt(curl->handle, CURLOPT_HEADER, 0);
>     curl_easy_setopt(curl->handle, CURLOPT_FOLLOWLOCATION, 0);
>     curl_easy_setopt(curl->handle, CURLOPT_SSL_VERIFYPEER,
> 
>  
> 
>  
> 
> B.R.
> 
> Benjamin Wang
> 

I wonder if this isn't actually a curl bug, since curl must know that the interface is down. That is, I think curl_easy_perform(), which is wrapped in esxVI_CURL_Perform(), should have returned an error.

Michal




