[libvirt] [PATCHv3 6/6] rpc: Fix connection close callback race condition and memory corruption/crash
Eric Blake
eblake at redhat.com
Tue Apr 2 16:20:20 UTC 2013
On 03/31/2013 10:20 AM, Peter Krempa wrote:
> The last Viktor's effort to fix the race and memory corruption unfortunately
> wasn't complete in the case the close callback was not registered in an
> connection. At that time, the trail of event's that I'll describe later could
> still happend and corrupt the memory or cause a crash of the client (including
s/happend/happen/
> the daemon in case of a p2p migration).
>
> Consider the following prerequisities and trail of events:
> Let's have a remote connection to a hypervisor that doesn't have a close
> callback registered and the client is using the event loop. The crash happens in
> cooperation of 2 threads. Thread E is the event loop and thread W is the worker
> that does some stuff. R denotes the remote client.
>
> 1.) W - The client finishes everything and sheds the last reference on the client
> 2.) W - The virObject stuff invokes virConnectDispose that invokes doRemoteClose
> 3.) W - the remote close method invokes the REMOTE_PROC_CLOSE RPC method.
> 4.) W - The thread is preempted at this point.
> 5.) R - The remote side recieves the close and closes the socket.
s/recieves/receives/
> 6.) E - poll() wakes up due to the closed socket and invokes the close callback
> 7.) E - The event loop is preempted right before remoteClientCloseFunc is called
> 8.) W - The worker now finishes, and frees the conn object.
> 9.) E - The remoteClientCloseFunc accesses the now-freed conn object in the
> attempt to retrieve pointer for the real close callback.
> 10.) Kaboom, corrupted memory/segfault.
>
> This patch tries to fix this by introducing a new object that survives the
> freeing of the connection object. We can't increase the reference count on the
> connection object itself as the connection would never be closed as the
s/itself as/itself or/; s/closed as/closed, as/
> connection is closed only when the reference count reaches zero.
>
> The new object - virConnectCloseCallbackData - is a lockable object that keeps
> the pointers to the real user registered callback and ensures that the
> connection callback is either not called if the connection was already freed or
> that the connection isn't freed while this is being called.
> ---
> src/datatypes.c | 55 ++++++++++++++++++++++++++++++++++++--------
> src/datatypes.h | 22 ++++++++++++++----
> src/libvirt.c | 29 ++++++++++++-----------
> src/remote/remote_driver.c | 57 +++++++++++++++++++++++++++-------------------
> 4 files changed, 112 insertions(+), 51 deletions(-)
So far, this is just a code review. I'm also planning on testing the
patch, and will report back with results later in the day.
> @@ -63,14 +65,19 @@ static void virStoragePoolDispose(void *obj);
> static int
> virDataTypesOnceInit(void)
> {
> -#define DECLARE_CLASS(basename) \
> - if (!(basename ## Class = virClassNew(virClassForObject(), \
> +#define DECLARE_CLASS_COMMON(basename, parent) \
> + if (!(basename ## Class = virClassNew(parent, \
> #basename, \
> sizeof(basename), \
> basename ## Dispose))) \
> return -1;
> +#define DECLARE_CLASS(basename) \
> + DECLARE_CLASS_COMMON(basename, virClassForObject())
> +#define DECLARE_CLASS_LOCKABLE(basename) \
> + DECLARE_CLASS_COMMON(basename, virClassForObjectLockable())
Wonder if these definitions are useful enough to rename to VIR_DECLARE_*
and move into virobject.h. But that would be a separate patch, to
adjust all clients that would benefit from it.
> +++ b/src/datatypes.h
> @@ -93,6 +93,22 @@ extern virClassPtr virStoragePoolClass;
> # define VIR_IS_DOMAIN_SNAPSHOT(obj) \
> (VIR_IS_SNAPSHOT(obj) && VIR_IS_DOMAIN((obj)->domain))
>
> +
> +typedef struct _virConnectCloseCallbackData virConnectCloseCallbackData;
> +typedef virConnectCloseCallbackData *virConnectCloseCallbackDataPtr;
> +
> +/**
> + * Internal structure holding data related to connection close callbacks.
> + */
> +struct _virConnectCloseCallbackData {
> + virObjectLockable parent;
> +
> + virConnectPtr conn;
> + virConnectCloseFunc callback;
> + void *opaque;
> + virFreeCallback freeCallback;
> +};
Would it make sense to move this struct definition to be local to
datatypes.c, and have all other uses of it treat it as opaque?...
> +++ b/src/libvirt.c
> @@ -20189,24 +20189,27 @@ int virConnectRegisterCloseCallback(virConnectPtr conn,
> virObjectRef(conn);
>
> virMutexLock(&conn->lock);
> + virObjectLock(conn->closeCallback);
>
> virCheckNonNullArgGoto(cb, error);
>
> - if (conn->closeCallback) {
> + if (conn->closeCallback->callback) {
...But then you would need to expose an accessor function instead of
directly accessing closeCallback->callback. Okay, probably fine as is.
> +++ b/src/remote/remote_driver.c
> @@ -337,32 +337,38 @@ enum virDrvOpenRemoteFlags {
> VIR_DRV_OPEN_REMOTE_AUTOSTART = (1 << 2), /* Autostart a per-user daemon */
> };
>
> +static void
> +remoteClientCloseFreeFunc(void *opaque)
> +{
> + virConnectCloseCallbackDataPtr cbdata = opaque;
> +
> + virObjectUnref(cbdata);
> +}
You shouldn't need this; just use virObjectFreeCallback instead.
> +remoteClientCloseFunc(virNetClientPtr client ATTRIBUTE_UNUSED,
> + int reason,
> + void *opaque)
> {
> - virConnectPtr conn = opaque;
> + virConnectCloseCallbackDataPtr cbdata = opaque;
>
> - virMutexLock(&conn->lock);
> - if (conn->closeCallback) {
> - virConnectCloseFunc closeCallback = conn->closeCallback;
> - void *closeOpaque = conn->closeOpaque;
> - virFreeCallback closeFreeCallback = conn->closeFreeCallback;
> - unsigned closeUnregisterCount = conn->closeUnregisterCount;
> + virObjectLock(cbdata);
>
> - VIR_DEBUG("Triggering connection close callback %p reason=%d",
> - conn->closeCallback, reason);
> - conn->closeDispatch = true;
> - virMutexUnlock(&conn->lock);
> - closeCallback(conn, reason, closeOpaque);
> - virMutexLock(&conn->lock);
> - conn->closeDispatch = false;
> - if (conn->closeUnregisterCount != closeUnregisterCount &&
> - closeFreeCallback)
> - closeFreeCallback(closeOpaque);
> + if (cbdata->callback) {
> + VIR_DEBUG("Triggering connection close callback %p reason=%d, opaque=%p",
> + cbdata->callback, reason, cbdata->opaque);
> + cbdata->callback(cbdata->conn, reason, cbdata->opaque);
> +
> + if (cbdata->freeCallback)
> + cbdata->freeCallback(cbdata->opaque);
> + cbdata->callback = NULL;
> + cbdata->freeCallback = NULL;
> }
> - virMutexUnlock(&conn->lock);
> + virObjectUnlock(cbdata);
> +
> + /* free the connection reference that comes along with the callback
> + * registration */
> + virObjectUnref(cbdata->conn);
> }
Took me a couple reads, but it looks right. The old code had to
temporarily drop connection lock while using the callback; the new code
never even grabs connection lock because all the callback data is
encapsulated in a separate object.
>
> /* helper macro to ease extraction of arguments from the URI */
> @@ -765,9 +771,11 @@ doRemoteOpen(virConnectPtr conn,
> goto failed;
> }
>
> + virObjectRef(conn->closeCallback);
> +
> virNetClientSetCloseCallback(priv->client,
> remoteClientCloseFunc,
> - conn, NULL);
> + conn->closeCallback, remoteClientCloseFreeFunc);
Here's where virObjectFreeCallback comes in handy.
On the surface, the code seems reasonable. If my testing passes, and
you fix up the typos and the unneeded helper function, then I don't mind
giving:
ACK.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 621 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20130402/fea0837d/attachment-0001.sig>
More information about the libvir-list
mailing list