[libvirt-users] Zombie processes being created when console buffer is full

Peter Steele psteele at peaxy.net
Fri Jan 29 00:00:36 UTC 2016


We have been researching stuck zombie processes in our libvirt lxc 
containers.  What we found was:

1) Each zombie’s parent was pid 1, i.e. init, which is a symlink to
systemd.
2) In some cases the zombie was launched by systemd; in others it was
inherited.
3) While the child is in the zombie state, /proc/1/status for the parent
process (systemd) shows no pending signals.
4) Attaching gdb to systemd showed a single thread, blocked in write()
on /dev/console.
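
For anyone wanting to repeat checks 1) to 3) without gdb, here is a 
minimal Python sketch. It is not the tooling we used; the function 
names are made up for illustration. It relies only on the State, PPid, 
SigPnd and ShdPnd fields of /proc/<pid>/status, and treats the two 
pending masks as hex bitmasks where bit n-1 stands for signal n.

```python
import os


def parse_status(text):
    """Parse the 'Key:<tab>value' lines of a /proc/<pid>/status file."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def is_zombie(fields):
    # The State field looks like "Z (zombie)" for a zombie process.
    return fields.get("State", "").startswith("Z")


def pending_signals(fields):
    # SigPnd is the per-thread mask, ShdPnd the process-wide one.
    mask = int(fields.get("SigPnd", "0"), 16) | int(fields.get("ShdPnd", "0"), 16)
    return [n for n in range(1, 65) if mask & (1 << (n - 1))]


def zombie_children(ppid=1, proc="/proc"):
    """Return the pids of zombie processes whose parent is `ppid`."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "status")) as fh:
                fields = parse_status(fh.read())
        except OSError:
            continue  # the process went away while we were scanning
        if is_zombie(fields) and fields.get("PPid") == str(ppid):
            found.append(int(entry))
    return found
```

Inside the container, zombie_children(1) would list the zombies parented 
to systemd, and pending_signals(parse_status(open("/proc/1/status").read())) 
would show whether anything is still queued for pid 1.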

This write() to the console never returns.  Our working assumption is
that systemd's SIGCHLD handler only sets a flag, and that the main
thread (the only thread) later notices the flag and reaps the child
processes.  While that single thread is stuck in write(), the reaping
never takes place.
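
To make that assumption concrete, here is a small Python sketch of the 
"handler sets a flag, main loop reaps" pattern. This is not systemd's 
code, only an illustration of the pattern we assumed; all names are 
hypothetical. The point is that reap_children() only runs when the main 
loop gets control back, so a loop stuck in a blocking call leaves the 
child as a zombie.

```python
import os
import signal
import time

child_exited = False  # set by the signal handler, cleared by the main loop


def on_sigchld(signum, frame):
    # The handler does no reaping itself; it only records the event.
    global child_exited
    child_exited = True


def reap_children():
    """Collect every exited child without blocking; return their pids."""
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children left at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped


def demo():
    """Fork a child that exits at once, then reap it from the main loop."""
    global child_exited
    old_handler = signal.signal(signal.SIGCHLD, on_sigchld)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: exit immediately
        deadline = time.monotonic() + 5
        while not child_exited and time.monotonic() < deadline:
            time.sleep(0.01)
        # The child stays a zombie until this point. If the loop were
        # blocked in write() instead, this line would never be reached.
        child_exited = False
        return pid, reap_children()
    finally:
        signal.signal(signal.SIGCHLD, old_handler)
```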

So why is write() blocking?  The answer seems to be that nothing is
draining the console, so its buffers eventually fill up and write()
blocks.  When we attached to the container's console, the buffer was
drained, allowing systemd’s write() to return.  The zombies were then
reaped and everything went back to normal.
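
The same fill-then-drain behaviour can be reproduced with an ordinary 
pipe. The sketch below is only an analogy (the container console is a 
pty, not a pipe, and the function names are invented), and it puts the 
descriptors in non-blocking mode so that "would block" shows up as an 
error instead of hanging the script.

```python
import os


def fill_until_blocked(fd, chunk=b"x" * 4096):
    """Write to fd until the kernel buffer is full; return bytes written."""
    os.set_blocking(fd, False)
    total = 0
    while True:
        try:
            total += os.write(fd, chunk)
        except BlockingIOError:
            # A blocking writer (like systemd here) would hang at this point.
            return total


def drain(fd):
    """Read everything currently buffered on fd; return bytes read."""
    os.set_blocking(fd, False)
    total = 0
    while True:
        try:
            data = os.read(fd, 65536)
        except BlockingIOError:
            return total  # buffer is empty
        if not data:
            return total  # writer closed its end
        total += len(data)


def demo():
    read_end, write_end = os.pipe()
    try:
        filled = fill_until_blocked(write_end)   # nobody is reading
        drained = drain(read_end)                # "attach to the console"
        written_after = os.write(write_end, b"x")  # succeeds again
        return filled, drained, written_after
    finally:
        os.close(read_end)
        os.close(write_end)
```

demo() returns the number of bytes that fit before the writer would have 
blocked, the number of bytes drained, and the result of one more write 
after draining.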

Our “solution” was more of a workaround.  systemd was altered to log
errors/warnings/etc to /dev/null instead of /dev/console.  This
prevents the problem only in the sense that the console buffer is now
unlikely to fill up, since systemd is generally the only thing that
writes to it.  This is definitely a hack though.

This may be a bug in the libvirt container library (you can't expect
something to periodically connect to a container's console to empty it
out).  We suspect there may also be a configuration issue in our
containers with regard to the console.

Has anyone else observed this problem?

Peter



