[libvirt] Zombie process after open libvirt connection

Michal Privoznik mprivozn at redhat.com
Wed Mar 19 17:27:34 UTC 2014


On 19.03.2014 12:10, Carlos Rodrigues wrote:
> Hello Michal,
>
> I am using libvirt 1.1.3 and perl-Sys-Virt 1.1.3 and perl-5.16 on Fedora
> 19 x86_64
>
> The zombie process appears after open libvirt connection with qemu-tls,
> and perl module is binding for libvirt library XS.
>
> Here is my running example with zombie process:
>
> $ perl test-chldhandle-bug-fixed.pl & sleep 15 && echo && ps axf | grep perl && echo
> [2] 12427
> init... pid=12427
> while...
> fork 1
> end... pid=12430
> receive chld
> fork 2
> end... pid=12431
> receive chld
> 2014-03-19 11:06:38.712+0000: 12427: info : libvirt version: 1.1.3.1, package: 2.fc19 (Unknown, 2014-03-17-15:02:00, cmar-laptop.lan)
> 2014-03-19 11:06:38.712+0000: 12427: warning : virNetTLSContextCheckCertificate:1140 : Certificate check failed Certificate [session] owner does not match the hostname 10.10.4.249
> connection open
> fork 3
> end... pid=12432
> fork 4
> end... pid=12440
>
> 12427 pts/2    S      0:00  |   \_ perl test-chldhandle-bug-fixed.pl
> 12432 pts/2    Z      0:00  |   |   \_ [perl] <defunct>
> 12440 pts/2    Z      0:00  |   |   \_ [perl] <defunct>
> 12442 pts/2    S+     0:00  |   \_ grep --color=auto perl

Aha! It seems this only happens when using tls; I was unable to 
reproduce it with tcp or unix sockets. And when using tcp I can see 
SIGCHLD being delivered, while with tls it is not. That makes me wonder 
whether either libvirt or gnutls silently changes the signal mask and 
does not restore it. Because if I take a look at the signal mask I can 
clearly see that SIGCHLD is blocked (from /proc/$pid/status):

SigPnd: 0000000000000000
ShdPnd: 0000000000010000
SigBlk: 0000000008011000
SigIgn: 0000000000001080
SigCgt: 0000000180010000
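
Each of the fields above is a hexadecimal bitmask in which bit n-1 stands 
for signal n. A minimal sketch for decoding such a value, assuming Linux 
x86_64 signal numbering (decode_sigmask and SIGNAL_NAMES are illustrative 
names, not part of libvirt or any library):

```python
# Assumption: Linux x86_64 signal numbers. Only the signals relevant
# to this thread are named; anything else is printed as a number.
SIGNAL_NAMES = {13: "SIGPIPE", 17: "SIGCHLD", 28: "SIGWINCH"}


def decode_sigmask(hex_mask):
    """Return the signal numbers set in a /proc/<pid>/status mask field."""
    mask = int(hex_mask, 16)
    # Bit 0 corresponds to signal 1, bit 1 to signal 2, and so on.
    return [bit + 1 for bit in range(mask.bit_length()) if mask & (1 << bit)]


blocked = decode_sigmask("0000000008011000")  # the SigBlk value above
print([SIGNAL_NAMES.get(n, str(n)) for n in blocked])
# prints ['SIGPIPE', 'SIGCHLD', 'SIGWINCH']
```

Applied to the ShdPnd value (0000000000010000) the same function returns 
[17], i.e. a SIGCHLD is pending but cannot be delivered because it is 
blocked.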

What we can see here is that SigBlk (the bitmask of blocked signals) 
contains 0x8011000, which is SIGPIPE, SIGCHLD and SIGWINCH. Right, why 
would libvirt care about SIGWINCH anyway? Git grepping for it leads us 
to virNetClientSetTLSSession(). I can clearly see that there we are 
adding just those three signals to a mask, then setting this mask just 
prior to calling poll(), and then restoring the old one. Oh wait, we are 
not! pthread_sigmask(SIG_BLOCK,...) just adds new signals to the mask, 
it does not overwrite the old one, so the three signals stay blocked 
after the "restore". So yes, this is clearly a libvirt bug.
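
The difference between the two modes can be reproduced outside libvirt. 
The following is only a sketch of the pattern described above, not the 
libvirt code itself; restore_with is a hypothetical helper, and Python's 
signal.pthread_sigmask is used as a thin wrapper around the same call:

```python
import signal


def restore_with(how):
    """Block three signals, then try to 'restore' the saved mask using `how`.

    Returns the set of signals that are still blocked afterwards.
    """
    # Query the current mask without changing it, so we can clean up later.
    original = signal.pthread_sigmask(signal.SIG_BLOCK, [])
    try:
        to_block = {signal.SIGPIPE, signal.SIGCHLD, signal.SIGWINCH}
        old = signal.pthread_sigmask(signal.SIG_BLOCK, to_block)
        # ... the poll() call would sit here ...
        signal.pthread_sigmask(how, old)  # the attempted restore
        return set(signal.pthread_sigmask(signal.SIG_BLOCK, []))
    finally:
        # Always put the real mask back, whatever `how` did.
        signal.pthread_sigmask(signal.SIG_SETMASK, original)


# SIG_BLOCK only ORs the old mask in, so the three signals stay blocked.
print(sorted(s.name for s in restore_with(signal.SIG_BLOCK)))
# SIG_SETMASK replaces the mask, so the original mask comes back.
print(sorted(s.name for s in restore_with(signal.SIG_SETMASK)))
```

With SIG_BLOCK the result always includes SIGPIPE, SIGCHLD and SIGWINCH; 
with SIG_SETMASK it equals whatever mask the thread had before.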

If I use SIG_SETMASK there, I am no longer getting any zombies. I'll 
post the patch shortly.

Michal



