[libvirt] [RFC PATCHv2 0/2] Implement tiered driver loading

Adam Walters adam at pandorasboxen.com
Tue Jan 28 16:16:12 UTC 2014


On Tue, Jan 28, 2014 at 8:05 AM, Daniel P. Berrange <berrange at redhat.com>wrote:

> On Thu, Jan 23, 2014 at 03:06:16PM -0500, Adam Walters wrote:
> > This patchset implements a tiered driver loading system. I split the
> hypervisor
> > drivers out into their own tier, which is loaded after the other
> drivers. This
> > has the net effect of ensuring that things like secrets, networks, etc.,
> are
> > initialized and auto-started before any hypervisors, such as QEMU, LXC,
> etc.
> > This resolves the race condition currently present when starting libvirtd
> > while domains are running, which happens when restarting libvirtd after
> having
> > started at least one domain.
>
> [snip]
>
> > For anyone who is not familiar with the race condition I mention above,
> the
> > basic description is that upon restarting libvirtd, any running QEMU
> domains
> > using storage volumes are killed randomly due to their associated
> storage pool
> > not yet being online. This is due to storage pool auto-start not having
> > completed prior to QEMU initialization. In my prior testing, I found
> that this
> > race condition affected at least one domain approximately 40% of the
> time. I
> > sent this information to the mailing list back on 06DEC2013, if anyone is
> > interested in going back and re-reading my description.
>
> Ok, so the current sequence is /supposed/ to be
>
>   1 StateInitialize
>        1.1 StorageInitialize
>             1.1.1 Load all configs
>             1.1.2 Detect currently active pools
>        1.2 QEMUInitialize
>             1.2.1 Load all configs
>             1.2.2 Detect currently active guests
>   2 StateAutostart
>        2.1 StorageAutostart
>             2.1.1 Start any inactive pools marked autostart
>        2.2 QEMUAutostart
>             2.2.1 Start any inactive guests marked autostart
>
>
> Looking at the storage driver code though, I see a flaw in that
> 1.1.2 does not actually exist. The checking for existing active
> storage pools is only done in step 2.1.1 as part of autostart,
> which as you say is clearly too late.
>
> Also the checking for existing active storage pools looks like
> it only works for storage pools which actually have some persistent
> kernel / userspace state outside of libvirt. ie iSCSI pools will
> remain active even when libvirt is not running, since the iSCSI
> initiator is outside scope of libvirt.  If we have a RBD pool,
> however, which is using librbd instead of the FUSE driver there's
> no persistent state to detect. The issue here is that the storage
> driver is not correctly keeping track of what storage pools were
> active prior to restart. ie any active storage pools before restart,
> should still be active after restart, *regardless* of the autostart
> flag.
>
> So I understand why your proposed split of driver loading works
> to solve this, but even with this it only solves the startup
> ordering problem. There's still the issue that we're reliant on
> the autostart flag being set. If there's a guest with an RBD
> volume against a pool that isn't set to autostart, we're still
> doomed.
>

I hadn't really thought about that piece, as I always set my storage pools
to auto-start. I'm still not sure that a race condition would not still
exist if
the storage driver continues to utilize a hard-coded connection to
'qemu:///',
though. I'm not particularly familiar enough with the libvirt code to know
exactly at which point the QEMU driver can accept a connection to its URI
and facilitate access to information about networks, secrets, etc.

Also note that I was also able to reproduce the race condition with a
file-backed
storage pool, too. So even taking out the semi-exotic RBD piece, I think
there
would still be problems if 1.1.2 were added into the code. I could
certainly look
into implementing something to store state information for storage pools.
I'll likely
look toward existing domain code for an idea of how to implement it.

That issue does sort of bleed into my other patchset that implements the
'config:///' URI, which we had some discussion over prior to the fixup and
resubmit.


>
> The storage drivers are just plain broken wrt libvirt restarts
> because they're not correctly recording their state and restoring
> back to the exact same state after restart.
>
> Daniel
> --
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/:|
> |: http://libvirt.org              -o-             http://virt-manager.org:|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/:|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc:|
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20140128/9b35d44d/attachment-0001.htm>


More information about the libvir-list mailing list