[libvirt] lock manager and libguestfs use (was: Re: RFC [2/3]: The lock manager plugin driver API)

Wed Nov 10 17:59:55 UTC 2010

On Wed, Nov 10, 2010 at 05:29:28PM +0000, Richard W.M. Jones wrote:
> On Wed, Nov 10, 2010 at 04:42:01PM +0000, Daniel P. Berrange wrote:
> > On Wed, Nov 10, 2010 at 04:17:13PM +0000, Richard W.M. Jones wrote:
> > > On Wed, Nov 10, 2010 at 03:52:49PM +0000, Daniel P. Berrange wrote:
> > > > The load_drv() method there is to allow the plugin implementation
> > > > to block its use with a libvirtd that is too old. eg, say we add
> > > > a new API 'foo', and the plugin implements that API. It wants to
> > > > block its use in any libvirt which doesn't know about the 'foo'
> > > 
> > > What sort of thing could 'foo' be?
> > > 
> > > > entry point, because such use could be dangerous to safety.
> > > 
> > > You should just rename the lock manager .so file in this case.  But
> > > it'd be best to write an enduring minimum API and ABI now to avoid
> > > having to do this.
> > > 
> > > > The
> > > > load_drv() method allows that to be handled. If load_drv() fails
> > > > libvirt/libguestfs *must* refuse to run any VMs.
> > > 
> > > Well, libguestfs has other alternatives.  Ignoring the lock manager is
> > > one, allowing read-only calls to proceed is another one.
> > 
> > Allowing read-only calls to continue isn't guarenteed to be safe.
> > The lock manager plugin may wish to deny even readonly  usage, to
> > ensure that there aren't data consistency complication with NFS
> > close-to-open behaviour during migration.  If a lock manager is
> > configured for use and it fails, not running any VMs is the only
> > safe thing, until the admin has fixed the config problem.
> 
> This is entirely separate from load_drv though.  You can deny
> read-only access when I try to acquire the resource.
> 
> What I'm wondering, and you're not answering, is what future event are
> you anticipating which would make the outline API
> (http://pastebin.ca/1987261) need to be completely overhauled?  I
> can't imagine it ...

We can't predict the future, so I designed the plugin to give maximum
flexibility to adapt to what it may bring. Making one API call after
loading the plugin to check compatibility really isn't any burden
and so its a worthwhile safety net for future change.

> > > There are still issues with this.  Multiple threads can call
> > > guestfs_launch (on different handles), resulting in reentrant calls to
> > > load_drv.  I'm not convinced that the API as proposed is safe against
> > > this sort of thing:
> > > 
> > > http://lists.fedoraproject.org/pipermail/devel/2010-January/129117.html
> > 
> > As mentioned below, these APIs are intended for doing global
> are [not]?
> > state management, instead they are for version/feature
> > compatibility checks.
> 
> But back to my original point, if these fail I'm left SOL.  This would
> prevent libguestfs from working under nearly every condition, and all
> I could suggest to users would be that something needs to be upgraded.

You wouldn't expect this to ever fail in normal circumstances. RPM
dependancies should ensure it doesn't get into a broken state. This
is there as a safety net in case the admin does something stupid,
so that we don't pose any risk to the safety of the disk images.

> A basic minimum always-supported API is a must here, unless there is
> genuinely some system out there that has such bizarre locking that we
> would need to change how this works completely, and if there is such a
> system perhaps we should consider it in the API now.

This isn't about completely changing the locking scheme, it is about
allowing extra APIs to be added to provide additional information.
If a plugin wants to be able to rely on that additional info being
present, it should be able to check that this is supported by the
libvirtd instance upfront. This avoids a runtime failure at time 
of guest startup, disk hotplug, or worse, at guest migration where
it is possibly already too late to cope. 

> > It isn't about the locking preventing unload. There are a variety
> > of scenarios in which is it unsafe to unload *any* dlopen'd 
> > library. eg, if the module has set a thread local variable with
> > a destructor, it is not safe to unload the module unless all
> > threads have terminated, or all thread locals cleaned up. The
> > module has no way to ensure this, so the only safe option is to
> > never dlclose() it, otherwise you'll get SIGBUS or SIGSEGV.
> 
> This isn't true.  An API which allowed futures should also allow those
> futures to be cancelled (from either the caller or callee side), and
> it should cancel them in the __attribute__((destructor)).  In any
> case, the API proposed doesn't seem to have this.

Unfortunately there are plenty of existing APIs out there which don't
provide a way to cancel future callbacks. eg pthread_key_create has
no way to guarantee that the destructor won't ever be called again,
unless you can kill off all threads before unloading the module.
Another example is the GObject type system which does not allow you
to unregister modules. Thus to be safe, we need to allow plugins to
indicate that they don't wish to be unloaded from memory.

Regards,
Daniel
-- 
|: Red Hat, Engineering, London    -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org        -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|