[libvirt] [PATCH v4 4/4] gluster: cache glfs connection object per volume

Prasanna Kalever pkalever at redhat.com
Mon Dec 12 14:32:14 UTC 2016


On Wed, Dec 7, 2016 at 10:38 PM, Peter Krempa <pkrempa at redhat.com> wrote:
> On Tue, Dec 06, 2016 at 22:52:01 +0530, Prasanna Kumar Kalever wrote:
>> Currently, in case if we have 4 extra attached disks, then for each disk
>> we need to call 'glfs_init' (over network) and friends which could be costly.
>>
>> Additionally snapshot(external) scenario will further complex the situation.
>
> This part was extracted to a separate patch already, thus this doesn't
> have to deal with snapshots in any way than with other gluster
> connections.

Yeah!

>
>> This patch maintain a cache of glfs objects per volume, hence the
>> all disks/snapshots belonging to the volume whose glfs object is cached
>> can avoid calls to 'glfs_init' and friends by simply taking a ref on
>> existing object.
>>
>> The glfs object is shared only if volume name and all the volfile servers match
>> (includes hostname, transport and port number).
>>
>> Thanks to 'Peter Krempa' for all the inputs.
>>
>> Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever at redhat.com>
>> ---
>>  src/storage/storage_backend_gluster.c | 279 +++++++++++++++++++++++++++++++---
>>  1 file changed, 254 insertions(+), 25 deletions(-)
>>
>> diff --git a/src/storage/storage_backend_gluster.c b/src/storage/storage_backend_gluster.c
>> index e0841ca..9f633ac 100644
>> --- a/src/storage/storage_backend_gluster.c
>> +++ b/src/storage/storage_backend_gluster.c
>> @@ -36,6 +36,20 @@
>>
>>  VIR_LOG_INIT("storage.storage_backend_gluster");
>>
>> +
>> +typedef struct _virStorageBackendGlusterState virStorageBackendGlusterState;
>> +typedef virStorageBackendGlusterState *virStorageBackendGlusterStatePtr;
>> +
>> +typedef struct _virStorageFileBackendGlusterPriv virStorageFileBackendGlusterPriv;
>> +typedef virStorageFileBackendGlusterPriv *virStorageFileBackendGlusterPrivPtr;
>> +
>> +typedef struct _virStorageBackendGlusterCache virStorageBackendGlusterCache;
>> +typedef virStorageBackendGlusterCache *virStorageBackendGlusterCachePtr;
>> +
>> +typedef struct _virStorageBackendGlusterCacheConn virStorageBackendGlusterCacheConn;
>> +typedef virStorageBackendGlusterCacheConn *virStorageBackendGlusterCacheConnPtr;
>> +
>> +
>>  struct _virStorageBackendGlusterState {
>>      glfs_t *vol;
>>
>> @@ -47,8 +61,205 @@ struct _virStorageBackendGlusterState {
>>      char *dir; /* dir from URI, or "/"; always starts and ends in '/' */
>>  };
>>
>> -typedef struct _virStorageBackendGlusterState virStorageBackendGlusterState;
>> -typedef virStorageBackendGlusterState *virStorageBackendGlusterStatePtr;
>> +struct _virStorageFileBackendGlusterPriv {
>> +    glfs_t *vol;
>
> You should also keep a reference to the object in the cache, so that you
> can return it and don't have to look it up again.
>
>> +    char *canonpath;
>> +};
>> +
>> +struct _virStorageBackendGlusterCacheConn {
>> +    virObjectLockable parent;
>> +    virStorageFileBackendGlusterPrivPtr priv;
>
> You don't really need the complete object here. you only need the glfs_t
> part here. The 'cannonpath' variable is different for possible storage
> source objects.

got it.

>
>> +    size_t nhosts;
>> +    virStorageNetHostDefPtr hosts;
>> +    char *volname;
>> +};
>> +
>> +struct _virStorageBackendGlusterCache {
>> +    virMutex lock;
>> +    size_t nConn;
>> +    virStorageBackendGlusterCacheConnPtr *conn;
>> +};
>> +
>> +
>> +static virStorageBackendGlusterCachePtr virStorageBackendGlusterGlobalCacheObj;
>> +static virClassPtr virStorageBackendGlusterCacheConnClass;
>> +static virOnceControl virStorageBackendGlusterCacheOnce = VIR_ONCE_CONTROL_INITIALIZER;
>> +static bool virStorageBackendGlusterCacheInitGlobalError;
>> +static bool virStorageBackendGlusterCacheConnCleaned;
>
> As said. Don't use global flags. You should be able to avoid both of
> them.
>
> Okay. I've seen below. They not only are avoidable, but your usage is
> outright dangerous.

Okay.

>
>> +
>> +
>> +static void virStorageBackendGlusterCacheConnDispose(void *obj);
>> +
>> +
>> +static int virStorageBackendGlusterCacheConnOnceInit(void)
>> +{
>> +    if (!(virStorageBackendGlusterCacheConnClass =
>> +                                virClassNew(virClassForObjectLockable(),
>> +                                "virStorageBackendGlusterCacheConnClass",
>> +                                sizeof(virStorageBackendGlusterCacheConn),
>> +                                virStorageBackendGlusterCacheConnDispose)))
>
> I've noticed that you actually create a lockable object here, but don't
> use the locks. This is currently okay, since it's write only. The lock
> might come handy though ...
>
>> +        return -1;
>> +
>> +    return 0;
>> +}
>> +
>> +static void virStorageBackendGlusterCacheConnDispose(void *object)
>> +{
>> +    virStorageBackendGlusterCacheConnPtr conn = object;
>> +
>> +    glfs_fini(conn->priv->vol);
>> +
>> +    VIR_FREE(conn->priv->canonpath);
>> +    VIR_FREE(conn->priv);
>> +    VIR_FREE(conn->volname);
>> +
>> +    virStorageNetHostDefFree(conn->nhosts, conn->hosts);
>> +
>> +    /* all set to delete the connection from cache */
>> +    virStorageBackendGlusterCacheConnCleaned = true;
>> +}
>> +
>> +VIR_ONCE_GLOBAL_INIT(virStorageBackendGlusterCacheConn);
>> +
>> +static bool
>> +virStorageBackendGlusterCompareHosts(size_t nSrcHosts,
>> +                                     virStorageNetHostDefPtr srcHosts,
>> +                                     size_t nDstHosts,
>> +                                     virStorageNetHostDefPtr dstHosts)
>> +{
>> +    bool match = false;
>> +    size_t i;
>> +    size_t j;
>> +
>> +    if (nSrcHosts != nDstHosts)
>> +        return false;
>> +    for (i = 0; i < nSrcHosts; i++) {
>> +        for (j = 0; j < nDstHosts; j++) {
>> +            if (srcHosts[i].type != dstHosts[j].type)
>> +                continue;
>> +            switch (srcHosts[i].type) {
>> +                case VIR_STORAGE_NET_HOST_TRANS_UNIX:
>> +                    if (STREQ(srcHosts[i].u.uds.path, dstHosts[j].u.uds.path))
>> +                        match = true;
>> +                    break;
>> +                case VIR_STORAGE_NET_HOST_TRANS_TCP:
>> +                case VIR_STORAGE_NET_HOST_TRANS_RDMA:
>> +                    if (STREQ(srcHosts[i].u.inet.addr, dstHosts[j].u.inet.addr)
>> +                                               &&
>> +                        STREQ(srcHosts[i].u.inet.port, dstHosts[j].u.inet.port))
>> +                        match = true;
>> +                    break;
>> +                case VIR_STORAGE_NET_HOST_TRANS_LAST:
>> +                    break;
>> +            }
>> +            if (match)
>> +                break; /* out of inner for loop */
>> +        }
>> +        if (!match)
>> +            return false;
>> +        match = false; /* reset */
>> +    }
>> +    return true;
>> +}
>> +
>> +static int
>> +virStorageBackendGlusterCacheAdd(virStorageSourcePtr src,
>> +                                 virStorageFileBackendGlusterPrivPtr priv)
>> +{
>> +    virStorageBackendGlusterCacheConnPtr newConn = NULL;
>> +    virStorageBackendGlusterCacheConnPtr conn = NULL;
>> +    size_t i;
>> +
>> +    for (i = 0; i < virStorageBackendGlusterGlobalCacheObj->nConn; i++) {
>
> Umm ... you iterate the cache ...

This is the problem with changing the code in the last minute.
I have edited something after I did git format-patch.
Will address them.

>
>> +        conn = virStorageBackendGlusterGlobalCacheObj->conn[i];
>> +        if (STRNEQ(conn->volname, src->volume))
>> +            continue;
>> +        if (virStorageBackendGlusterCompareHosts(conn->nhosts, conn->hosts,
>> +                                                 src->nhosts, src->hosts)) {
>> +            return 0;
>> +        }
>> +    }
>> +
>> +    if (virStorageBackendGlusterCacheConnInitialize() < 0)
>> +        return -1;
>
> ... before attempting to initialize the global cache?!? That doesn't
> make sense. Also this call is not needed here, since the only entrypoint
> is virStorageFileBackendGlusterInit which initializes it.
>
>> +
>> +    if (!(newConn = virObjectLockableNew(virStorageBackendGlusterCacheConnClass)))
>> +        return -1;
>> +
>> +    if (VIR_STRDUP(newConn->volname, src->volume) < 0)
>> +        goto error;
>> +
>> +    newConn->nhosts = src->nhosts;
>> +    if (!(newConn->hosts = virStorageNetHostDefCopy(src->nhosts, src->hosts)))
>> +        goto error;
>> +    newConn->priv = priv;
>> +
>> +    if (VIR_APPEND_ELEMENT(virStorageBackendGlusterGlobalCacheObj->conn,
>> +                           virStorageBackendGlusterGlobalCacheObj->nConn,
>> +                           newConn) < 0)
>> +        goto error;
>> +
>> +    return 0;
>> +
>> + error:
>> +    virStorageNetHostDefFree(newConn->nhosts, newConn->hosts);
>> +    VIR_FREE(newConn->volname);
>
> You leak newConn in some cases.

Unref!

>
>> +    return -1;
>> +}
>> +
>> +static glfs_t *
>> +virStorageBackendGlusterCacheQuery(virStorageSourcePtr src)
>
> We usually use "Lookup".

good one. My naming skills are pathetic. Mercy me ...

>
>> +{
>> +    virStorageBackendGlusterCacheConnPtr conn;
>> +    size_t i;
>> +
>> +    if (virStorageBackendGlusterGlobalCacheObj == NULL)
>> +        return NULL;
>> +
>> +    for (i = 0; i < virStorageBackendGlusterGlobalCacheObj->nConn; i++) {
>> +        conn = virStorageBackendGlusterGlobalCacheObj->conn[i];
>> +        if (STRNEQ(conn->volname, src->volume))
>> +            continue;
>> +        if (virStorageBackendGlusterCompareHosts(conn->nhosts, conn->hosts,
>> +                                                 src->nhosts, src->hosts)) {
>> +            virObjectRef(conn);
>> +            return conn->priv->vol;
>> +        }
>> +    }
>> +
>> +    return NULL;
>> +}
>> +
>> +static void
>> +virStorageBackendGlusterCacheRefresh(virStorageSourcePtr src)
>
> This name is weird. Did you mean "Release"?

As you say.

>
>> +{
>> +    virStorageFileBackendGlusterPrivPtr priv = src->drv->priv;
>> +    virStorageBackendGlusterCacheConnPtr conn = NULL;
>> +    size_t i;
>> +
>> +    virMutexLock(&virStorageBackendGlusterGlobalCacheObj->lock);
>> +    for (i = 0; i < virStorageBackendGlusterGlobalCacheObj->nConn; i++) {
>> +        conn = virStorageBackendGlusterGlobalCacheObj->conn[i];
>> +        if (STRNEQ(conn->volname, src->volume))
>> +            continue;
>> +        if (virStorageBackendGlusterCompareHosts(conn->nhosts, conn->hosts,
>> +                                                 src->nhosts, src->hosts)) {
>> +            virObjectUnref(conn);
>> +            if (!virStorageBackendGlusterCacheConnCleaned) {
>
> Umm, nope, that's not how it works. This is totaly thread unsafe. Other
> thread can modify your global flag.
>
> You probably did not check how virObjectUnref actually works. The return
> value notifes you if the object was disposed (you removed the last
> referernce) or not.

Exactly.

Okay I realized virObjectUnref has ret value, which no one used it
earlier or I missed them all.
Me bad hacker used the same way. I will start using the ret value.

The best part is, I did not notice your comment above and bugged the
code, after many tries finally found virObjectUnref has ret, when I
stared reading the email for reply I found that you already gave the
clue above. Poor me!

>
>> +                if (priv && priv->canonpath)
>> +                    VIR_FREE(priv->canonpath);
>> +                goto unlock;
>> +            }
>> +            virStorageBackendGlusterCacheConnCleaned = false;
>> +            VIR_DELETE_ELEMENT(virStorageBackendGlusterGlobalCacheObj->conn, i,
>> +                               virStorageBackendGlusterGlobalCacheObj->nConn);
>> +            VIR_FREE(src->drv);
>
> This leaks src->drv in cases where the entry was used by two objects
> simultaneously. This needs to be done in the calling function. (As also
> pointed out in one of the other patches)

right, Will make this part of virStorageFileDeinit.

>
>> +        }
>> +    }
>> +
>> + unlock:
>> +    virMutexUnlock(&virStorageBackendGlusterGlobalCacheObj->lock);
>> +}
>>
>>  static void
>>  virStorageBackendGlusterClose(virStorageBackendGlusterStatePtr state)
>> @@ -538,32 +749,15 @@ virStorageBackend virStorageBackendGluster = {
>>  };
>>
>>
>> -typedef struct _virStorageFileBackendGlusterPriv virStorageFileBackendGlusterPriv;
>> -typedef virStorageFileBackendGlusterPriv *virStorageFileBackendGlusterPrivPtr;
>> -
>> -struct _virStorageFileBackendGlusterPriv {
>> -    glfs_t *vol;
>> -    char *canonpath;
>> -};
>> -
>> -
>>  static void
>>  virStorageFileBackendGlusterDeinit(virStorageSourcePtr src)
>>  {
>> -    virStorageFileBackendGlusterPrivPtr priv = src->drv->priv;
>> -
>>      VIR_DEBUG("deinitializing gluster storage file %p (gluster://%s:%s/%s%s)",
>>                src, src->hosts->u.inet.addr,
>>                src->hosts->u.inet.port ? src->hosts->u.inet.port : "0",
>>                src->volume, src->path);
>>
>> -    if (priv->vol)
>> -        glfs_fini(priv->vol);
>> -    VIR_FREE(priv->canonpath);
>> -
>> -    VIR_FREE(priv);
>> -
>> -    VIR_FREE(src->drv);
>> +    virStorageBackendGlusterCacheRefresh(src);
>>  }
>>
>>  static int
>> @@ -610,6 +804,20 @@ virStorageFileBackendGlusterInitServer(virStorageFileBackendGlusterPrivPtr priv,
>>      return 0;
>>  }
>>
>> +static void
>> +virStorageBackendGlusterCacheInit(void)
>> +{
>> +    if (VIR_ALLOC(virStorageBackendGlusterGlobalCacheObj) < 0) {
>> +        virStorageBackendGlusterCacheInitGlobalError = false;
>> +        return;
>> +    }
>> +
>> +    if (virMutexInit(&virStorageBackendGlusterGlobalCacheObj->lock) < 0) {
>> +        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
>> +                       _("Unable to initialize mutex"));
>> +        virStorageBackendGlusterCacheInitGlobalError = false;
>> +    }
>> +}
>>
>>  static int
>>  virStorageFileBackendGlusterInit(virStorageSourcePtr src)
>> @@ -632,11 +840,32 @@ virStorageFileBackendGlusterInit(virStorageSourcePtr src)
>>                src, priv, src->volume, src->path,
>>                (unsigned int)src->drv->uid, (unsigned int)src->drv->gid);
>>
>> +    if (virOnce(&virStorageBackendGlusterCacheOnce,
>> +                virStorageBackendGlusterCacheInit) < 0)
>> +        return -1;
>> +
>> +    if (virStorageBackendGlusterCacheInitGlobalError)
>> +        return -1;
>> +
>> +    virMutexLock(&virStorageBackendGlusterGlobalCacheObj->lock);
>> +
>> +    priv->vol = virStorageBackendGlusterCacheQuery(src);
>> +    if (priv->vol) {
>> +        src->drv->priv = priv;
>> +        virMutexUnlock(&virStorageBackendGlusterGlobalCacheObj->lock);
>> +        return 0;
>> +    }
>> +
>>      if (!(priv->vol = glfs_new(src->volume))) {
>>          virReportOOMError();
>> -        goto error;
>> +        goto unlock;
>>      }
>>
>> +    if (virStorageBackendGlusterCacheAdd(src, priv) < 0)
>> +        goto unlock;
>> +
>> +    virMutexUnlock(&virStorageBackendGlusterGlobalCacheObj->lock);
>> +
>
> So yet another thread-unsafe part. You correctly add the not-yet-opened
> connection to the cache, but there is no mechanism to guard suff after
> this.
>
> If a second thread obtains the cache entry wile the first one is not yet
> finished opening the connection it will get an invalid one.
>
> Any thread obtaining an entry from the cache needs to check whether the
> conection was initialized already as the first thing to do. It can
> continue only if it's already open and otherwise needs to wait until the
> original thread finishes.
>
> The first thread that added the object into the cache needs to open the
> connection and notify all other threads potentialy waiting on the
> condition that it was opened. A virCond should help.

Thanks for the clue, will need to figure out how to use virCond.

>
>>      for (i = 0; i < src->nhosts; i++) {
>>          if (virStorageFileBackendGlusterInitServer(priv, src->hosts + i) < 0)
>>              goto error;

Apologies for the delay.
Got a switch to internal priority item in gluster.

Will try to work on the reviews tomorrow.

Thanks Peter!

--
Prasanna




More information about the libvir-list mailing list