[libvirt] Internal error message from netcf when trying to start libvirtd

Daniel P. Berrange berrange at redhat.com
Mon Dec 2 17:24:17 UTC 2013


On Mon, Dec 02, 2013 at 06:17:00PM +0100, Michal Privoznik wrote:
> On 17.10.2013 14:40, Daniel P. Berrange wrote:
> > On Tue, Oct 15, 2013 at 06:03:27PM +0200, Christophe Fergeau wrote:
> >> Hey,
> >>
> >> Due to a configuration issue on my system, libvirtd is not starting on my
> >> system (not complaining about this!):
> >> 2013-10-15 15:40:51.024+0000: 10222: info : libvirt version: 1.1.3
> >> 2013-10-15 15:40:51.024+0000: 10222: error : virNetTLSContextCheckCertFile:117 : Cannot read CA certificate
> >> '/home/teuf/usr/etc/pki/CA/cacert.pem': No such file or directory
> >>
> >> However, before libvirtd exits, I get another error message from the netcf
> >> code, which is unexpected this time:
> >> 2013-10-15 15:49:18.361+0000: 10222: error : netcfStateCleanup:105 :
> >> internal error: Attempt to close netcf state driver already closed
> >>
> >> This message comes from the call of virStateCleanup() at the end of main()
> >> in libvirtd.c. virStateCleanup() should not be called before
> >> daemonStateInit() has been called in main.
> >> After this call, things get more ugly as daemonStateInit() calls
> >> virStateInitialize() from a thread, so there's probably a small window for
> >> virStateInitialize() and virStateCleanup() running concurrently if an error
> >> occurs between the call to daemonStateInit() and the call to
> >> virNetServerRun().
> >>
> >> I'm sending this email rather than a patch as I'm not sure what is the best
> >> way to fix it. The easy way would be for virStateCleanup() to be a noop
> >> when virStateInitialize() hasn't been called (iow remove the error message
> >> from netcfStateCleanup). However, this would leave this small race
> >> condition around (which is not that bad as it would only occurs in
> >> situations when the daemon fails to start). So another approach would be to
> >> set a vir_state_initialized boolean once the thread has called
> >> ivrStateInitialize, and only call virStateCleanup() when it's set.
> >>
> >> Or maybe there's a 3rd way to fix this?
> >>
> >> Let me know if you have any guidance into the best way to fix this,
> > 
> > Yeah, I've known about this issue for a while and its a bit horrible
> > to solve. I think we probably need to have a global mutex to serialize
> > the virStateInitialize, virStateReload and virStateCleanup calls so
> > that none of them can run in parallel.
> > 
> > I'd have virGlobalInit create the mutex instance, since we know that
> > is run from thread safe context.
> > 
> > Then make the virState* calls hold that mutex while executing.
> > 
> > We don't want virStateCleanup to skip execution if virStateInitialize
> > has failed though - every callback in virStateCleanup should be written
> > to be safe if its corresponding init function hasn't run. Just the
> > mutex serialization ought to be sufficient I think.
> > 
> > 
> > Daniel
> > 
> 
> Sorry for bringing up a dead thread, but as you both may have noticed I
> was trying to solve this too:
> 
> https://www.redhat.com/archives/libvir-list/2013-November/msg01311.html
> 
> And I don't think Dan, your approach is gonna work. I mean, the reason
> for running virStateInitialize in thread is - it may take ages to

Actually no, the reason for using a thread is that the initialization
code needs to have a working event loop - so you cannot run the
virStateInitialize code from the same thread that is running the
event loop.

You're probably thinking of the QEMU VM autostart code, whicih spawns a
separate thread to auto-start QEMU VMs to avoid delaying use of the QEMU
driver while (slow) autostart takes place.

> complete. Hence, serializing virState* will lead to main thread waiting
> for the virStateInitialize to complete (which may not happen, i.e. in
> case of deadlock somewhere in a driver).
> 
> But I was thinking of rather refined locking. Turning each of
> vir*DriverTab into virObjectLockable(). Then, each access would require
> object to lock itself. Moreover, the virStateCleanup would unregister
> the driver. Unfortunately, it could only unregister stateful drivers.

This race condition scenario only hits us in two cases

 - libvirtd startup is being aborted due to some failure
   while StateInit is still running
 - libvirtd reload is triggered by SIGHUP causing StateReload
   to run

In both of those cases I think it is absolutely desirable that
virStateReload and virStateCleanup both block until virStateInitialize
completes.

The question of virStateInitialize deadlock has no bearing on this.
Any such problems are simply a bug that we must fix. So I don't see
why we shouldn't have a global mutex serializing the three
virState{Init,Reload,Cleanup} APIs.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|




More information about the libvir-list mailing list