[libvirt] Loosing lxc guests when restarting libvirt
Guido Günther
agx at sigxcpu.org
Sat Dec 24 23:21:18 UTC 2016
On Sat, Dec 24, 2016 at 05:14:44PM +0100, Guido Günther wrote:
> Hi Cedric,
> On Wed, Dec 21, 2016 at 02:36:39PM +0100, Cedric Bosdonnat wrote:
> > Hey Christian,
> >
> > On Tue, 2016-12-20 at 12:29 +0100, Christian Ehrhardt wrote:
> > > Hi,
> > > I found an issue in libvirt related to libvirt-lxc, but fail to find the root cause.
> > >
> > > The TL;DR is: libvirt-lxc guests get killed on libvirt restart due to "internal error: No valid cgroup for machine"
> > >
> > > I was able to reproduce it with libvirt 1.3.1, 2.4 and 2.5 as packaged in Ubuntu and Debian.
> > > I wanted to ask for two things:
> > > - wider coverage where this does reproduce
> >
> > I couldn't reproduce here with openSUSE Tumbleweed and libvirt 2.5 packages.
>
> I had a short look and it seems like this sequence is killing all running
> libvirt-lxc guests reliably:
>
> # no lxc guest running yet
> export LIBVIRT_DEFAULT_URI=lxc:///
> DOMAIN=sl
> systemctl daemon-reload
>
> # start lxc guest
> virsh start ${DOMAIN}
> sleep 1 # give vm some time to start
> systemctl restart libvirtd
Using ftrace I can see that systemd moves the process into the wrong
cgroup on start:
systemd-1 [000] .... 652.333068: cgroup_attach_task: dst_root=3 dst_id=80 dst_level=2 dst_path=/system.slice/libvirtd.service pid=4073 comm=libvirt_lxc
systemd-1 [000] .... 652.333117: cgroup_attach_task: dst_root=3 dst_id=80 dst_level=2 dst_path=/system.slice/libvirtd.service pid=4073 comm=libvirt_lxc
systemd-1 [000] .... 652.333160: cgroup_attach_task: dst_root=6 dst_id=80 dst_level=2 dst_path=/system.slice/libvirtd.service pid=4073 comm=libvirt_lxc
systemd-1 [000] .... 652.333203: cgroup_attach_task: dst_root=4 dst_id=107 dst_level=2 dst_path=/system.slice/libvirtd.service pid=4073 comm=libvirt_lxc
systemd-1 [000] .... 652.333245: cgroup_attach_task: dst_root=8 dst_id=80 dst_level=2 dst_path=/system.slice/libvirtd.service pid=4073 comm=libvirt_lxc
systemd-1 [000] .... 652.333286: cgroup_attach_task: dst_root=7 dst_id=84 dst_level=2 dst_path=/system.slice/libvirtd.service pid=4073 comm=libvirt_lxc
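Trace lines like the ones above can be condensed to one row per attach event. The following is only a rough sketch and not part of the attached script: the function name attach_summary is made up for illustration, and it assumes the key=value layout shown above and a comm value without spaces.

```shell
#!/bin/bash
# attach_summary COMM: read ftrace text on stdin and print
# "pid dst_root dst_path" for every cgroup_attach_task event whose
# comm= field equals COMM. (Hypothetical helper, for illustration only.)
attach_summary() {
    local comm="$1"
    awk -v comm="$comm" '
        /cgroup_attach_task:/ {
            split("", f)                      # portable way to clear the array
            for (i = 1; i <= NF; i++) {
                n = index($i, "=")
                if (n > 0)
                    f[substr($i, 1, n - 1)] = substr($i, n + 1)
            }
            if (f["comm"] == comm)
                print f["pid"], f["dst_root"], f["dst_path"]
        }'
}
```

It could be fed from the trace buffer, for example `attach_summary libvirt_lxc < /sys/kernel/debug/tracing/trace`, to list which hierarchies the libvirt_lxc process was attached to and where.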
I've attached a script that reproduces this and would welcome any
ideas about the root cause.
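The attached script only prints /proc/$PID/cgroup before and after the restart. A check along the following lines could flag the move automatically. This is a sketch under assumptions: in_unit_cgroup is a hypothetical helper, and it relies on the "id:controllers:path" line format of /proc/PID/cgroup with no colon inside the path.

```shell
#!/bin/bash
# in_unit_cgroup FILE UNIT: succeed if any line of FILE (in the
# /proc/PID/cgroup format "id:controllers:path") has UNIT as one
# component of its cgroup path. (Hypothetical helper, illustration only.)
in_unit_cgroup() {
    local file="$1" unit="$2"
    awk -F: -v unit="$unit" '
        {
            n = split($3, parts, "/")
            for (i = 1; i <= n; i++)
                if (parts[i] == unit)
                    found = 1
        }
        END { exit (found ? 0 : 1) }' "$file"
}
```

For instance, `in_unit_cgroup "/proc/$PID/cgroup" libvirtd.service` returning success after the restart would match the attach events seen in the trace.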
Cheers,
-- Guido
-------------- next part --------------
#!/bin/bash
set -e
export LIBVIRT_DEFAULT_URI=lxc:///
DOMAIN=sl
function cleanup () {
    set +x
    echo "Running cleanup"
    # Stop recording cgroup trace events
    echo 0 > /sys/kernel/debug/tracing/events/cgroup/enable
    virsh -c lxc:/// destroy "${DOMAIN}" || true
    if [ -n "$SUCCESS" ]; then
        echo "Finished successfully"
    else
        echo "Got an error."
    fi
}
trap cleanup EXIT
cat <<EOF >dom.xml
<domain type='lxc'>
  <name>sl</name>
  <memory unit='KiB'>256000</memory>
  <currentMemory unit='KiB'>256000</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type>exe</type>
    <init>/bin/bash</init>
  </os>
  <features>
    <privnet/>
  </features>
  <clock offset='utc'/>
  <devices>
    <filesystem type='mount' accessmode='passthrough'>
      <source dir='/'/>
      <target dir='/'/>
    </filesystem>
    <console type='pty'>
      <target type='lxc' port='0'/>
    </console>
  </devices>
</domain>
EOF
virsh define dom.xml || true
echo 1 > /sys/kernel/debug/tracing/events/cgroup/enable
# Reload systemd, this triggers the problem
echo "systemctl daemon-reload start" > /sys/kernel/debug/tracing/trace_marker
systemctl daemon-reload
echo "systemctl daemon-reload finished" > /sys/kernel/debug/tracing/trace_marker
set -x
# Start the lxc container
echo "virsh start ${DOMAIN} start" > /sys/kernel/debug/tracing/trace_marker
virsh start ${DOMAIN}
echo "virsh start ${DOMAIN} finished" > /sys/kernel/debug/tracing/trace_marker
virsh list
PID=$(virsh -c lxc:/// list --state-running | sed -ne 's/ \([0-9]\+\) .*/\1/p')
WATCH=/proc/$PID/cgroup
echo "Before ${WATCH}"
cat ${WATCH}
sleep 1
# Restart libvirtd
echo "systemctl stop libvirtd start" > /sys/kernel/debug/tracing/trace_marker
systemctl stop libvirtd
echo "systemctl stop libvirtd finished" > /sys/kernel/debug/tracing/trace_marker
echo "systemctl start libvirtd start" > /sys/kernel/debug/tracing/trace_marker
systemctl start libvirtd
echo "systemctl start libvirtd finished" > /sys/kernel/debug/tracing/trace_marker
# Check if container is still there
echo "After"
cat ${WATCH}
if ! virsh list | grep -qs "${DOMAIN}[[:space:]]\+running"; then
    echo 'Domain disappeared!'
    exit 1
fi
SUCCESS=1