[libvirt] [RFC PATCH] lxc: don't return error on GetInfo when cgroups not yet set up
serge.hallyn at canonical.com
Fri Sep 30 16:00:53 UTC 2011
Quoting Daniel P. Berrange (berrange at redhat.com):
> On Thu, Sep 29, 2011 at 10:12:17PM -0500, Serge E. Hallyn wrote:
> > Quoting Daniel P. Berrange (berrange at redhat.com):
> > > On Wed, Sep 28, 2011 at 02:14:52PM -0500, Serge E. Hallyn wrote:
> > > > Nova (openstack) calls libvirt to create a container, then
> > > > periodically checks using GetInfo to see whether the container
> > > > is up. If it does this too quickly, then libvirt returns an
> > > > error, which in libvirt.py causes an exception to be raised,
> > > > the same type as if the container was bad.
> > > lxcDomainGetInfo() holds a mutex on 'dom' for the duration of
> > > its execution. It checks for virDomainObjIsActive() before
> > > trying to use the cgroups.
> > Yes, it does, but
> > > lxcDomainStart() holds the mutex on 'dom' for the duration of
> > > its execution, and does not return until the container is running
> > > and cgroups are present.
> > No. It calls the lxc_controller with --background. The controller's
> > main task in turn exits before the cgroups have been set up. That
> > is the race.
> The lxcDomainStart() method isn't actually waiting on the child
> pid directly, so the --background flag ought not to matter. We
> have a pipe that we pass into the controller, which we wait on
> for a notification after running the process. The controller
> does not notify the 'handshake' FD until after cgroups have
> been set up, unless I'm misinterpreting our code.
That's the call to lxcContainerWaitForContinue(), right? If so, that's
done by lxcContainerChild(), which is called by the lxc_controller.
AFAICS there is nothing in the lxc_driver which will wait on that
before dropping the driver->lock mutex.
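
To make the ordering under discussion concrete, here is a minimal Python
sketch of the handshake-pipe pattern described above. It is not libvirt
code: the function name, the sleep standing in for cgroup setup, and the
one-byte message are all assumptions made for illustration. The point is
only that the starting side blocks on the pipe, not on the child pid, so
it cannot return before the child reports that setup is finished.

```python
import os
import time


def start_and_wait_for_handshake(setup_delay=0.2):
    """Fork a stand-in 'controller' and block until it signals readiness.

    Hypothetical illustration only; none of these names exist in libvirt.
    Returns True if the child signalled, False if it exited without doing so.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: stand-in for the controller process.
        os.close(read_fd)
        time.sleep(setup_delay)       # stand-in for setting up cgroups
        os.write(write_fd, b"1")      # notify the handshake FD
        os._exit(0)

    # Parent: stand-in for the driver's start routine.
    os.close(write_fd)
    data = os.read(read_fd, 1)        # blocks until the child writes or exits
    os.close(read_fd)
    os.waitpid(pid, 0)                # reap the child
    return data == b"1"
```

If the child exits without writing, the read returns an empty result and
the function reports failure instead of hanging. The race discussed in
this thread corresponds to the parent returning (and releasing its lock)
without ever doing the blocking read.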