[libvirt] QEMU driver thread safety rules

Fri Oct 30 12:46:25 UTC 2009

On Fri, Oct 30, 2009 at 11:05:03AM +0000, Daniel P. Berrange wrote:
> On Thu, Oct 29, 2009 at 06:04:29PM +0100, Daniel Veillard wrote:
> > > All other API calls making changes get safely queued up at
> > > step 1, but  API calls which simply wish to query information
> > > can run without being blocked at all. This fixes the major
> > > concurrency problem with running monitor commands. The use
> > > of a condition variable at the start of step 1, also allows
> > > us to time out API calls, if some other thread get stuck in
> > > the monitor for too long.  I think this also makes the use of
> > > a RWLock on the QEMU driver unneccessary, since no code will
> > > ever be holding a mutex in any place that sleeps/wait. Only
> > > the condition variable will be held during sleeps/waits.
> > > 
> > > Since we'll now effectively have 3 locks, and 1 condition
> > > variable this is getting kind of complex. So the rest of this
> > > mail is a file I propose to put in src/qemu/THREADS.txt
> > > describing what is going on, and showing the recommended 
> > > design patterns to use.
> > 
> >   I have just one remark, this separation between APIs might
> > be done one level up, i.e. at the library entry point level
> > we should know what may induce a state change and those could
> > be flagged more formally. This may help other drivers where
> > libvirt needs to keep the state instead of asking the hypervisor.
> 
> It can't be done at the library entry level, since the locking needs
> to be done against objects that are private to the driver.

  hum, I'm not suggesting to do the locking one level up, but
to flag in some way entry points which may induce a state change.

> > > Basic locking primitives
> > > ------------------------
> > > 
> > > There are a number of locks on various objects
> > > 
> > >   * struct qemud_driver: RWLock
> 
> Opps, this should have said 'Mutex' rather than RWLock 

  Ah right, since you said you dropped the idea I was a bit surprized...

> > > 
> > >     This is the top level lock on the entire driver. Every API call in
> > >     the QEMU driver is blocked while this is held, though some internal
> > >     callbacks may still run asynchronously. This lock must never be held
> > >     for anything which sleeps/waits (ie monitor commands)
> > > 
> > >     When obtaining the driver lock, under *NO* circumstances must
> > >     any lock be held on a virDomainObjPtr. This *WILL* result in
> > >     deadlock.
> > 
> >   Any chance to enforce that at the code level ? Since we have
> > primitives for both, we could once the RW lock is taken set a flag in
> > the driver, and the DomainObj locking/unlocking routine could raise an
> > error if this happen.
> 
> That is not possible todo safely. If you add a flag in the driver to
> indicate whether it is locked or not, then you need to add another
> mutex to protect reads/write to that flag, otherwise you've got a 
> clear race condition in checking it.

  Well you can't protect at 100% but that state change could be modified
after having taken the lock and just before releasing it. It's not a
protection against reentrancy, it's about detecting a problem at
runtime. You may not be able to detect it for a new nanosecs after
having taken the lock or before releasing it, but that will allow better
runtime error reporting in general than just a hung driver which was the
outcome when we introduced locking and chased locking.
  The goal is to be able to log (in general but not a garanteed 100%)
when we are doing something which may lead to a lock, report this in the
log files and allow users to send them.

> > >   * qemuMonitorPtr:  Mutex
> > > 
> > >     Lock to be used when invoking any monitor command to ensure safety
> > >     wrt any asynchronous events that may be dispatched from the monitor.
> > >     It should be acquired before running a command.
> > > 
> > >     The job condition *MUST* be held before acquiring the monitor lock
> > > 
> > >     The virDomainObjPtr lock *MUST* be held before acquiring the monitor
> > >     lock.
> > > 
> > >     The virDomainObjPtr lock *MUST* then be released when invoking the
> > >     monitor command.
> > > 
> > >     The driver lock *MUST* be released when invoking the monitor commands.
> > > 
> > >     This ensures that the virDomainObjPtr & driver are both unlocked while
> > >     sleeping/waiting for the monitor response.
> > 
> >   I had to read this twice and I'm not sure I managed to fully map
> > mentally the full set of constraints. 
> 
> Essentially there's a hierarchy of objects
> 
>    Driver -> virDomainObjPtr  -> qemuMonitorPtr
> 
> You have to acquire the locks in that order, and once you've acquired
> the final qemuMonitorPtr lock, you must release the other locks
> before running the actual monitor command.

  okay

> >   It would be good if a maximum number of the constraints lested above
> >   could also be checked at runtime. Sure we could try to make new
> >   checking rules like we did for previous locking checks but it's hard
> >   for someone doing a patch to really run those. And I doubt the extra
> >   burden of checking a few conditions in locking routines would really
> >   impact performances. The only problem might be availbaility of
> >   pointers at the locking routines (or wrappers) to get the
> >   informations.
> 
> As before it is not possible to check those constraints safely at
> runtime without adding yet more locks. The idea of adding these methods 
> qemuDomainObjBeginJob, qemuDomainObjEndJob, qemuDomainObjEnterMonitor
> and qemuDomainObjExitMonitor, is that they take the complexity out of
> the code. By defining the common code patterns, and making everything
> use these helpers instead of the locks themselves, we ensure that all
> code is compliant with the rules. It has taken that complex set of
> ordering rules and simplified it to one of the patterns shown below

  A good point, sure ! But a small extra step would allow to debug this
more easilly, unless you;re sure we really won't hit problems after the
refactoring.

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel at veillard.com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/