[libvirt] PATCH: Ensure errors are guarenteed reported in virConnectOpen

Thu Aug 21 09:42:51 UTC 2008

On Thu, Aug 21, 2008 at 11:30:11AM +0200, Daniel Veillard wrote:
> On Thu, Aug 21, 2008 at 09:41:12AM +0100, Daniel P. Berrange wrote:
> > On Thu, Aug 21, 2008 at 08:35:25AM +0200, Daniel Veillard wrote:
> > > > To make it thread-safe we'll need to add a real virGetThreadLastError()
> > > > API, which is something on my todo list - with that new apps can just
> > > > call thevirGetThreadLastError() exclusively and never need to know the
> > > > distinction between  global/connection errors which causes so much
> > > > trouble. I'm fairly sure I can preseve existing semantics at same 
> > > > time with some suitable internal cleverness.
> > > 
> > >   I really wished we could avoid thread local storage mess, and in
> > > general anything having to do with API exported global variables. In
> > > general (I mean for the vast majority of the userland code dealing with
> > > libvirt) there is always a domain or connection object where we can plug
> > > the error, and provide it in-context. The only exception is the
> > > connection creation, maybe this means we need to provide a better entry
> > > point for this, but I would really prefer to not expose the notion of
> > > thread at the API level.
> > 
> > Well you see I want to be able to use the same virConnectPtr object
> > from multiple threads. I've looked at how todo this and it actually
> > won't be very hard once we have a way to report thread-local errors.
> 
>   not very hard ??? You expect to put a global lock in the connection
> The threading problem is a pandora box, once you provide the
> functionality people will use it in ways you didn't expect. You will see
> things like people closing the connection in one thread while another
> thread is still using it. I don't think you can catch all scenarios
> so i'm quite worried about suggesting it will be usable and then facing
> undebuggable scenarios coming from lost users.

Well, ok so let me step back a bit because there's actually several
layers to this issue...

  - The libvirtd is single threaded. This was reasonable at first, but
    some of our APIs take a long time to complete, so we need to have
    libvirtd parallelized.

  - The QEMU/LXC drivers have state stored in the daemon, so even if
    you merely allow libvirtd to parallelize by putting long running
    connections in the 'background' you ned to have locking in the
    QEMU/LXC drivers.

  - Given that we need to have locking anyway, might as well just make
    the libvirtd daemon properly multi-threaded, rather than trying to
    have a multi-request queuing system.

  - Once you've done all that, then solving the virError thread-local
    issue is all that remains to allow virConnect to be accessed by
    multiple threads

We need to do 1->3 in order to make libvirtd more scalable. At which
point 4 becomes 'not very hard' :-)

That said, there is one other option I've thought about instead of 
doing step 4, we could add a  'virConnectDup' method. This would take
an existing connection object, and create an independant copy of it
that another thread could use. The key would be that the internal
driver would automagically pass authentication credentials somehow,
so you wouldn't have to re-authenticate the 2nd connection. For the
remote driver, this could be as simple as invoking an RPC call to
fetch a one-time key, and passing that to the 2nd connection to use
as their authentication credential.

> > Re-using a single virConnectPtr object is important because when talking
> > to a remote libvirtd instance, you don't want to have to open multiple
> > connections for each thread in your app, because that will require the
> > user to re-authenticate multiple times (or for the app to store your
> > credentials - not cool)
> 
>   you keep one thread per connection. You will have to lock anyway,
> it's not like you will gain anything by trying to parallelize on a
> single connection. I really don't see what you're trying to do. Even
> entry points without side effect will need some kind of locking.

The libvirt APIs are in fact very parallelizable. There are very few times
you need to access state on the virConnectPtr object - most of the time
we're dealing with virDomainPtr objects. So you merely need to lock the
connection, while you obtain a lock on the virDomainPtr object, then 
you can release the connection lock. So the connection lock is held for
virtually no time at all. I reckon it'll parallelize very well. But its
probably not worth getting into this debate unless I've actually got
some proof of concept code to show, which is probably quite a long way
out timewise...

Daniel
-- 
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|