[libvirt] [PATCH v2] virObject: Error on suspicious ref and unref

Michal Privoznik mprivozn at redhat.com
Fri Nov 29 12:59:14 UTC 2013


On 29.11.2013 08:18, Michal Privoznik wrote:
> https://bugzilla.redhat.com/show_bug.cgi?id=1033061
> 
> Since our transformation into virObject is not complete and we must do
> ref and unref ourselves there's a chance that we will get it wrong. That
> is, while one thread is doing unref and subsequent dispose another
> thread may come and do the ref & unref on a stale pointer. This results in
> dispose being called twice (and possibly simultaneously). This kind of
> error is hard to catch so we should at least throw an error into logs
> if such a situation occurs. In fact, I've seen a stack trace showing this
> error had happened (obj = 0x7f4968018260):

On second thought, I don't think this patch is that good. libvirtd has
only a very small window in which this patch would work. The window
opens in the destroy callback, where the memory allocated for an object
is free()d, and closes when glibc actually unmaps that memory. After
that point, accessing a stale pointer either:

a) results in access into unmapped memory and thus SIGSEGV

b) results in access into mapped but unrelated memory, where some random
value is incremented or decremented, and hence our check for the refcount
being smaller than or equal to one is bogus.

So I think I have to self-NAK this one. Sigh.

Anyway, just for the record, the original bug is as follows (MT =
MainThread, the thread running main(); IT = InitializeThread):

1) (MT) daemonStateInit spawns a new thread (IT) to initialize all the
drivers, which subsequently autostart domains, ...

2) (IT) creates a new driver, say the netcf driver in this case.
driver.refs = 1

3) (MT) For some reason, we exit the event loop early (e.g. SIGINT was
delivered), resulting in a call to virStateCleanup(), which iterates over
the table of drivers and calls the ->stateCleanup() method on each one. In
our specific case, the netcf driver calls virObjectUnref(driver), so
driver.refs = 0 and hence the dispose cb is called.

4) (IT) Doesn't know anything about quitting, and tries to autostart
domains. Say LXC domains for now. So it opens a new dummy connection,
which causes virObjectRef(driver). But wait! The driver is already being
disposed.

5) (IT) Eventually calls virConnectClose(), which unrefs the driver
again, resulting in the driver being disposed a second time.
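
For illustration, the sequence in steps 2-5 can be replayed deterministically
on a single thread. This is only a sketch: toy_driver, toy_ref, toy_unref and
toy_dispose are hypothetical stand-ins, not libvirt's virObject code, and the
object lives on the stack so that nothing is actually freed and touching it
after "dispose" stays well defined.

```c
#include <stdio.h>

/* Hypothetical stand-in for a refcounted driver; NOT libvirt's virObject. */
struct toy_driver {
    int refs;           /* reference count */
    int dispose_calls;  /* how many times the dispose callback ran */
};

/* A real dispose callback would free memory; here we only count calls. */
static void toy_dispose(struct toy_driver *d)
{
    d->dispose_calls++;
}

static void toy_ref(struct toy_driver *d)
{
    d->refs++;
}

static void toy_unref(struct toy_driver *d)
{
    if (--d->refs == 0)
        toy_dispose(d);
}

/* Replays steps 2-5 in order and returns how often dispose ran. */
static int replay_double_dispose(void)
{
    struct toy_driver driver = { 0, 0 };

    driver.refs = 1;     /* 2) IT creates the driver */
    toy_unref(&driver);  /* 3) MT cleanup: refs 1 -> 0, first dispose */
    toy_ref(&driver);    /* 4) IT opens a connection: refs 0 -> 1 on a dead object */
    toy_unref(&driver);  /* 5) IT closes the connection: refs 1 -> 0, second dispose */

    return driver.dispose_calls;
}
```

Calling replay_double_dispose() from a small main() and printing the result
shows two dispose calls for one object, which is the double dispose described
above. With real heap memory, steps 4 and 5 would be use-after-free.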

Therefore I think the correct way to solve this is to remove each driver
from the global driver table while iterating over its items in
virStateCleanup().
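
A minimal sketch of that idea, assuming a simple fixed-size table: each entry
is taken out of the table before it is torn down, so a late lookup finds
nothing instead of a driver that is being disposed. All names here
(toy_state_driver, toy_table, toy_register, toy_lookup, toy_state_cleanup)
are hypothetical and this is not the libvirt implementation. The sketch is
single-threaded; real code would presumably also need a lock around the
table.

```c
#include <stddef.h>

#define TOY_MAX_DRIVERS 4

/* Hypothetical driver record; NOT libvirt's virStateDriver. */
struct toy_state_driver {
    const char *name;
    int cleaned;  /* set to 1 once the driver has been torn down */
};

static struct toy_state_driver *toy_table[TOY_MAX_DRIVERS];
static size_t toy_count;

/* Adds a driver to the global table; returns -1 if the table is full. */
static int toy_register(struct toy_state_driver *d)
{
    if (toy_count == TOY_MAX_DRIVERS)
        return -1;
    toy_table[toy_count++] = d;
    return 0;
}

/* What a late caller (e.g. autostart code) would use to find a driver. */
static struct toy_state_driver *toy_lookup(size_t i)
{
    return i < toy_count ? toy_table[i] : NULL;
}

/* Removes each driver from the table first, then tears it down. */
static void toy_state_cleanup(void)
{
    while (toy_count > 0) {
        struct toy_state_driver *d = toy_table[--toy_count];

        toy_table[toy_count] = NULL;  /* no longer reachable via the table */
        d->cleaned = 1;               /* stand-in for ->stateCleanup() */
    }
}
```

After toy_state_cleanup() returns, toy_lookup() yields NULL for every index,
so a caller that checks the result cannot take a reference on a driver that
has already been cleaned up.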

Michal



