[libvirt] [PATCH v4 4/4] gluster: cache glfs connection object per volume

Prasanna Kalever pkalever at redhat.com
Wed Dec 14 19:48:22 UTC 2016


On Mon, Dec 12, 2016 at 8:02 PM, Prasanna Kalever <pkalever at redhat.com> wrote:
> On Wed, Dec 7, 2016 at 10:38 PM, Peter Krempa <pkrempa at redhat.com> wrote:
>> On Tue, Dec 06, 2016 at 22:52:01 +0530, Prasanna Kumar Kalever wrote:
>>> Currently, in case if we have 4 extra attached disks, then for each disk
>>> we need to call 'glfs_init' (over network) and friends which could be costly.
>>>
>>> Additionally snapshot(external) scenario will further complex the situation.
>>
>> This part was extracted to a separate patch already, thus this doesn't
>> have to deal with snapshots in any way than with other gluster
>> connections.
>
> Yeah!

Sorry, this still holds good. on taking a snapshot
virStorageFileGetMetadataRecurse will be called on each overlay.
Which seeks/hits the cache help while the virStorageFileInitAs
respective calls (In case if snaps reside on the same volume).

Thanks,
--
Prasanna

>
>>
>>> This patch maintain a cache of glfs objects per volume, hence the
>>> all disks/snapshots belonging to the volume whose glfs object is cached
>>> can avoid calls to 'glfs_init' and friends by simply taking a ref on
>>> existing object.
>>>
>>> The glfs object is shared only if volume name and all the volfile servers match
>>> (includes hostname, transport and port number).
>>>
>>> Thanks to 'Peter Krempa' for all the inputs.
>>>
>>> Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever at redhat.com>
>>> ---
>>>  src/storage/storage_backend_gluster.c | 279 +++++++++++++++++++++++++++++++---
>>>  1 file changed, 254 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/src/storage/storage_backend_gluster.c b/src/storage/storage_backend_gluster.c
>>> index e0841ca..9f633ac 100644
>>> --- a/src/storage/storage_backend_gluster.c
>>> +++ b/src/storage/storage_backend_gluster.c
>>> @@ -36,6 +36,20 @@
>>>
>>>  VIR_LOG_INIT("storage.storage_backend_gluster");
>>>
>>> +
>>> +typedef struct _virStorageBackendGlusterState virStorageBackendGlusterState;
>>> +typedef virStorageBackendGlusterState *virStorageBackendGlusterStatePtr;
>>> +
>>> +typedef struct _virStorageFileBackendGlusterPriv virStorageFileBackendGlusterPriv;
>>> +typedef virStorageFileBackendGlusterPriv *virStorageFileBackendGlusterPrivPtr;
>>> +
>>> +typedef struct _virStorageBackendGlusterCache virStorageBackendGlusterCache;
>>> +typedef virStorageBackendGlusterCache *virStorageBackendGlusterCachePtr;
>>> +
>>> +typedef struct _virStorageBackendGlusterCacheConn virStorageBackendGlusterCacheConn;
>>> +typedef virStorageBackendGlusterCacheConn *virStorageBackendGlusterCacheConnPtr;
>>> +
>>> +
>>>  struct _virStorageBackendGlusterState {
>>>      glfs_t *vol;
>>>
>>> @@ -47,8 +61,205 @@ struct _virStorageBackendGlusterState {
>>>      char *dir; /* dir from URI, or "/"; always starts and ends in '/' */
>>>  };
>>>
>>> -typedef struct _virStorageBackendGlusterState virStorageBackendGlusterState;
>>> -typedef virStorageBackendGlusterState *virStorageBackendGlusterStatePtr;
>>> +struct _virStorageFileBackendGlusterPriv {
>>> +    glfs_t *vol;
>>
>> You should also keep a reference to the object in the cache, so that you
>> can return it and don't have to look it up again.
>>
>>> +    char *canonpath;
>>> +};
>>> +
>>> +struct _virStorageBackendGlusterCacheConn {
>>> +    virObjectLockable parent;
>>> +    virStorageFileBackendGlusterPrivPtr priv;
>>
>> You don't really need the complete object here. you only need the glfs_t
>> part here. The 'cannonpath' variable is different for possible storage
>> source objects.
>
> got it.
>
>>
>>> +    size_t nhosts;
>>> +    virStorageNetHostDefPtr hosts;
>>> +    char *volname;
>>> +};
>>> +
>>> +struct _virStorageBackendGlusterCache {
>>> +    virMutex lock;
>>> +    size_t nConn;
>>> +    virStorageBackendGlusterCacheConnPtr *conn;
>>> +};
>>> +
>>> +
>>> +static virStorageBackendGlusterCachePtr virStorageBackendGlusterGlobalCacheObj;
>>> +static virClassPtr virStorageBackendGlusterCacheConnClass;
>>> +static virOnceControl virStorageBackendGlusterCacheOnce = VIR_ONCE_CONTROL_INITIALIZER;
>>> +static bool virStorageBackendGlusterCacheInitGlobalError;
>>> +static bool virStorageBackendGlusterCacheConnCleaned;
>>
>> As said. Don't use global flags. You should be able to avoid both of
>> them.
>>
>> Okay. I've seen below. They not only are avoidable, but your usage is
>> outright dangerous.
>
> Okay.
>
>>
>>> +
>>> +
>>> +static void virStorageBackendGlusterCacheConnDispose(void *obj);
>>> +
>>> +
>>> +static int virStorageBackendGlusterCacheConnOnceInit(void)
>>> +{
>>> +    if (!(virStorageBackendGlusterCacheConnClass =
>>> +                                virClassNew(virClassForObjectLockable(),
>>> +                                "virStorageBackendGlusterCacheConnClass",
>>> +                                sizeof(virStorageBackendGlusterCacheConn),
>>> +                                virStorageBackendGlusterCacheConnDispose)))
>>
>> I've noticed that you actually create a lockable object here, but don't
>> use the locks. This is currently okay, since it's write only. The lock
>> might come handy though ...
>>
>>> +        return -1;
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static void virStorageBackendGlusterCacheConnDispose(void *object)
>>> +{
>>> +    virStorageBackendGlusterCacheConnPtr conn = object;
>>> +
>>> +    glfs_fini(conn->priv->vol);
>>> +
>>> +    VIR_FREE(conn->priv->canonpath);
>>> +    VIR_FREE(conn->priv);
>>> +    VIR_FREE(conn->volname);
>>> +
>>> +    virStorageNetHostDefFree(conn->nhosts, conn->hosts);
>>> +
>>> +    /* all set to delete the connection from cache */
>>> +    virStorageBackendGlusterCacheConnCleaned = true;
>>> +}
>>> +
>>> +VIR_ONCE_GLOBAL_INIT(virStorageBackendGlusterCacheConn);
>>> +
>>> +static bool
>>> +virStorageBackendGlusterCompareHosts(size_t nSrcHosts,
>>> +                                     virStorageNetHostDefPtr srcHosts,
>>> +                                     size_t nDstHosts,
>>> +                                     virStorageNetHostDefPtr dstHosts)
>>> +{
>>> +    bool match = false;
>>> +    size_t i;
>>> +    size_t j;
>>> +
>>> +    if (nSrcHosts != nDstHosts)
>>> +        return false;
>>> +    for (i = 0; i < nSrcHosts; i++) {
>>> +        for (j = 0; j < nDstHosts; j++) {
>>> +            if (srcHosts[i].type != dstHosts[j].type)
>>> +                continue;
>>> +            switch (srcHosts[i].type) {
>>> +                case VIR_STORAGE_NET_HOST_TRANS_UNIX:
>>> +                    if (STREQ(srcHosts[i].u.uds.path, dstHosts[j].u.uds.path))
>>> +                        match = true;
>>> +                    break;
>>> +                case VIR_STORAGE_NET_HOST_TRANS_TCP:
>>> +                case VIR_STORAGE_NET_HOST_TRANS_RDMA:
>>> +                    if (STREQ(srcHosts[i].u.inet.addr, dstHosts[j].u.inet.addr)
>>> +                                               &&
>>> +                        STREQ(srcHosts[i].u.inet.port, dstHosts[j].u.inet.port))
>>> +                        match = true;
>>> +                    break;
>>> +                case VIR_STORAGE_NET_HOST_TRANS_LAST:
>>> +                    break;
>>> +            }
>>> +            if (match)
>>> +                break; /* out of inner for loop */
>>> +        }
>>> +        if (!match)
>>> +            return false;
>>> +        match = false; /* reset */
>>> +    }
>>> +    return true;
>>> +}
>>> +
>>> +static int
>>> +virStorageBackendGlusterCacheAdd(virStorageSourcePtr src,
>>> +                                 virStorageFileBackendGlusterPrivPtr priv)
>>> +{
>>> +    virStorageBackendGlusterCacheConnPtr newConn = NULL;
>>> +    virStorageBackendGlusterCacheConnPtr conn = NULL;
>>> +    size_t i;
>>> +
>>> +    for (i = 0; i < virStorageBackendGlusterGlobalCacheObj->nConn; i++) {
>>
>> Umm ... you iterate the cache ...
>
> This is the problem with changing the code in the last minute.
> I have edited something after I did git format-patch.
> Will address them.
>
>>
>>> +        conn = virStorageBackendGlusterGlobalCacheObj->conn[i];
>>> +        if (STRNEQ(conn->volname, src->volume))
>>> +            continue;
>>> +        if (virStorageBackendGlusterCompareHosts(conn->nhosts, conn->hosts,
>>> +                                                 src->nhosts, src->hosts)) {
>>> +            return 0;
>>> +        }
>>> +    }
>>> +
>>> +    if (virStorageBackendGlusterCacheConnInitialize() < 0)
>>> +        return -1;
>>
>> ... before attempting to initialize the global cache?!? That doesn't
>> make sense. Also this call is not needed here, since the only entrypoint
>> is virStorageFileBackendGlusterInit which initializes it.
>>
>>> +
>>> +    if (!(newConn = virObjectLockableNew(virStorageBackendGlusterCacheConnClass)))
>>> +        return -1;
>>> +
>>> +    if (VIR_STRDUP(newConn->volname, src->volume) < 0)
>>> +        goto error;
>>> +
>>> +    newConn->nhosts = src->nhosts;
>>> +    if (!(newConn->hosts = virStorageNetHostDefCopy(src->nhosts, src->hosts)))
>>> +        goto error;
>>> +    newConn->priv = priv;
>>> +
>>> +    if (VIR_APPEND_ELEMENT(virStorageBackendGlusterGlobalCacheObj->conn,
>>> +                           virStorageBackendGlusterGlobalCacheObj->nConn,
>>> +                           newConn) < 0)
>>> +        goto error;
>>> +
>>> +    return 0;
>>> +
>>> + error:
>>> +    virStorageNetHostDefFree(newConn->nhosts, newConn->hosts);
>>> +    VIR_FREE(newConn->volname);
>>
>> You leak newConn in some cases.
>
> Unref!
>
>>
>>> +    return -1;
>>> +}
>>> +
>>> +static glfs_t *
>>> +virStorageBackendGlusterCacheQuery(virStorageSourcePtr src)
>>
>> We usually use "Lookup".
>
> good one. My naming skills are pathetic. Mercy me ...
>
>>
>>> +{
>>> +    virStorageBackendGlusterCacheConnPtr conn;
>>> +    size_t i;
>>> +
>>> +    if (virStorageBackendGlusterGlobalCacheObj == NULL)
>>> +        return NULL;
>>> +
>>> +    for (i = 0; i < virStorageBackendGlusterGlobalCacheObj->nConn; i++) {
>>> +        conn = virStorageBackendGlusterGlobalCacheObj->conn[i];
>>> +        if (STRNEQ(conn->volname, src->volume))
>>> +            continue;
>>> +        if (virStorageBackendGlusterCompareHosts(conn->nhosts, conn->hosts,
>>> +                                                 src->nhosts, src->hosts)) {
>>> +            virObjectRef(conn);
>>> +            return conn->priv->vol;
>>> +        }
>>> +    }
>>> +
>>> +    return NULL;
>>> +}
>>> +
>>> +static void
>>> +virStorageBackendGlusterCacheRefresh(virStorageSourcePtr src)
>>
>> This name is weird. Did you mean "Release"?
>
> As you say.
>
>>
>>> +{
>>> +    virStorageFileBackendGlusterPrivPtr priv = src->drv->priv;
>>> +    virStorageBackendGlusterCacheConnPtr conn = NULL;
>>> +    size_t i;
>>> +
>>> +    virMutexLock(&virStorageBackendGlusterGlobalCacheObj->lock);
>>> +    for (i = 0; i < virStorageBackendGlusterGlobalCacheObj->nConn; i++) {
>>> +        conn = virStorageBackendGlusterGlobalCacheObj->conn[i];
>>> +        if (STRNEQ(conn->volname, src->volume))
>>> +            continue;
>>> +        if (virStorageBackendGlusterCompareHosts(conn->nhosts, conn->hosts,
>>> +                                                 src->nhosts, src->hosts)) {
>>> +            virObjectUnref(conn);
>>> +            if (!virStorageBackendGlusterCacheConnCleaned) {
>>
>> Umm, nope, that's not how it works. This is totaly thread unsafe. Other
>> thread can modify your global flag.
>>
>> You probably did not check how virObjectUnref actually works. The return
>> value notifes you if the object was disposed (you removed the last
>> referernce) or not.
>
> Exactly.
>
> Okay I realized virObjectUnref has ret value, which no one used it
> earlier or I missed them all.
> Me bad hacker used the same way. I will start using the ret value.
>
> The best part is, I did not notice your comment above and bugged the
> code, after many tries finally found virObjectUnref has ret, when I
> stared reading the email for reply I found that you already gave the
> clue above. Poor me!
>
>>
>>> +                if (priv && priv->canonpath)
>>> +                    VIR_FREE(priv->canonpath);
>>> +                goto unlock;
>>> +            }
>>> +            virStorageBackendGlusterCacheConnCleaned = false;
>>> +            VIR_DELETE_ELEMENT(virStorageBackendGlusterGlobalCacheObj->conn, i,
>>> +                               virStorageBackendGlusterGlobalCacheObj->nConn);
>>> +            VIR_FREE(src->drv);
>>
>> This leaks src->drv in cases where the entry was used by two objects
>> simultaneously. This needs to be done in the calling function. (As also
>> pointed out in one of the other patches)
>
> right, Will make this part of virStorageFileDeinit.
>
>>
>>> +        }
>>> +    }
>>> +
>>> + unlock:
>>> +    virMutexUnlock(&virStorageBackendGlusterGlobalCacheObj->lock);
>>> +}
>>>
>>>  static void
>>>  virStorageBackendGlusterClose(virStorageBackendGlusterStatePtr state)
>>> @@ -538,32 +749,15 @@ virStorageBackend virStorageBackendGluster = {
>>>  };
>>>
>>>
>>> -typedef struct _virStorageFileBackendGlusterPriv virStorageFileBackendGlusterPriv;
>>> -typedef virStorageFileBackendGlusterPriv *virStorageFileBackendGlusterPrivPtr;
>>> -
>>> -struct _virStorageFileBackendGlusterPriv {
>>> -    glfs_t *vol;
>>> -    char *canonpath;
>>> -};
>>> -
>>> -
>>>  static void
>>>  virStorageFileBackendGlusterDeinit(virStorageSourcePtr src)
>>>  {
>>> -    virStorageFileBackendGlusterPrivPtr priv = src->drv->priv;
>>> -
>>>      VIR_DEBUG("deinitializing gluster storage file %p (gluster://%s:%s/%s%s)",
>>>                src, src->hosts->u.inet.addr,
>>>                src->hosts->u.inet.port ? src->hosts->u.inet.port : "0",
>>>                src->volume, src->path);
>>>
>>> -    if (priv->vol)
>>> -        glfs_fini(priv->vol);
>>> -    VIR_FREE(priv->canonpath);
>>> -
>>> -    VIR_FREE(priv);
>>> -
>>> -    VIR_FREE(src->drv);
>>> +    virStorageBackendGlusterCacheRefresh(src);
>>>  }
>>>
>>>  static int
>>> @@ -610,6 +804,20 @@ virStorageFileBackendGlusterInitServer(virStorageFileBackendGlusterPrivPtr priv,
>>>      return 0;
>>>  }
>>>
>>> +static void
>>> +virStorageBackendGlusterCacheInit(void)
>>> +{
>>> +    if (VIR_ALLOC(virStorageBackendGlusterGlobalCacheObj) < 0) {
>>> +        virStorageBackendGlusterCacheInitGlobalError = false;
>>> +        return;
>>> +    }
>>> +
>>> +    if (virMutexInit(&virStorageBackendGlusterGlobalCacheObj->lock) < 0) {
>>> +        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
>>> +                       _("Unable to initialize mutex"));
>>> +        virStorageBackendGlusterCacheInitGlobalError = false;
>>> +    }
>>> +}
>>>
>>>  static int
>>>  virStorageFileBackendGlusterInit(virStorageSourcePtr src)
>>> @@ -632,11 +840,32 @@ virStorageFileBackendGlusterInit(virStorageSourcePtr src)
>>>                src, priv, src->volume, src->path,
>>>                (unsigned int)src->drv->uid, (unsigned int)src->drv->gid);
>>>
>>> +    if (virOnce(&virStorageBackendGlusterCacheOnce,
>>> +                virStorageBackendGlusterCacheInit) < 0)
>>> +        return -1;
>>> +
>>> +    if (virStorageBackendGlusterCacheInitGlobalError)
>>> +        return -1;
>>> +
>>> +    virMutexLock(&virStorageBackendGlusterGlobalCacheObj->lock);
>>> +
>>> +    priv->vol = virStorageBackendGlusterCacheQuery(src);
>>> +    if (priv->vol) {
>>> +        src->drv->priv = priv;
>>> +        virMutexUnlock(&virStorageBackendGlusterGlobalCacheObj->lock);
>>> +        return 0;
>>> +    }
>>> +
>>>      if (!(priv->vol = glfs_new(src->volume))) {
>>>          virReportOOMError();
>>> -        goto error;
>>> +        goto unlock;
>>>      }
>>>
>>> +    if (virStorageBackendGlusterCacheAdd(src, priv) < 0)
>>> +        goto unlock;
>>> +
>>> +    virMutexUnlock(&virStorageBackendGlusterGlobalCacheObj->lock);
>>> +
>>
>> So yet another thread-unsafe part. You correctly add the not-yet-opened
>> connection to the cache, but there is no mechanism to guard suff after
>> this.
>>
>> If a second thread obtains the cache entry wile the first one is not yet
>> finished opening the connection it will get an invalid one.
>>
>> Any thread obtaining an entry from the cache needs to check whether the
>> conection was initialized already as the first thing to do. It can
>> continue only if it's already open and otherwise needs to wait until the
>> original thread finishes.
>>
>> The first thread that added the object into the cache needs to open the
>> connection and notify all other threads potentialy waiting on the
>> condition that it was opened. A virCond should help.
>
> Thanks for the clue, will need to figure out how to use virCond.
>
>>
>>>      for (i = 0; i < src->nhosts; i++) {
>>>          if (virStorageFileBackendGlusterInitServer(priv, src->hosts + i) < 0)
>>>              goto error;
>
> Apologies for the delay.
> Got a switch to internal priority item in gluster.
>
> Will try to work on the reviews tomorrow.
>
> Thanks Peter!
>
> --
> Prasanna




More information about the libvir-list mailing list