[Linux-cachefs] [PATCH] CacheFiles: Fix a race in cachefiles_delete_object() vs rename

Mark Moseley moseleymark at gmail.com
Wed Feb 24 19:46:08 UTC 2010


On Fri, Feb 19, 2010 at 10:14 AM, David Howells <dhowells at redhat.com> wrote:
> cachefiles_delete_object() can race with rename.  It gets the parent directory
> of the object it's asked to delete, then locks it - but rename may have changed
> the object's parent between the get and the completion of the lock.
>
> However, if such a circumstance is detected, we abandon our attempt to delete
> the object - since it's no longer in the index key path, it won't be seen
> again by lookups of that key.  The assumption is that cachefilesd may have
> culled it by renaming it to the graveyard for later destruction.
>
> Signed-off-by: David Howells <dhowells at redhat.com>
> ---
>
>  fs/cachefiles/namei.c |   12 +++++++++++-
>  1 files changed, 11 insertions(+), 1 deletions(-)
>
> diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
> index 14ac480..eeb4986 100644
> --- a/fs/cachefiles/namei.c
> +++ b/fs/cachefiles/namei.c
> @@ -348,7 +348,17 @@ int cachefiles_delete_object(struct cachefiles_cache *cache,
>        dir = dget_parent(object->dentry);
>
>        mutex_lock_nested(&dir->d_inode->i_mutex, I_MUTEX_PARENT);
> -       ret = cachefiles_bury_object(cache, dir, object->dentry);
> +
> +       /* we need to check that our parent is _still_ our parent - it may have
> +        * been renamed */
> +       if (dir == object->dentry->d_parent) {
> +               ret = cachefiles_bury_object(cache, dir, object->dentry);
> +       } else {
> +               /* it got moved, presumably by cachefilesd culling it, so it's
> +                * no longer in the key path and we can ignore it */
> +               mutex_unlock(&dir->d_inode->i_mutex);
> +               ret = 0;
> +       }
>
>        dput(dir);
>        _leave(" = %d", ret);
>

Dunno if this patch was related to the other thread where I mentioned
the "Unlink failed" error. But since it's coming from the called
function -- cachefiles_bury_object -- of the patched function I'm
guessing it's related, so I figured I'd mention that with 2.6.32.9
(with the above patch applied) this morning, it's still giving the
same error:

Feb 24 13:50:31 server kernel: [    3.064935] FS-Cache: Loaded
Feb 24 13:50:31 server kernel: [    3.082345] CacheFiles: Loaded
Feb 24 13:50:31 server kernel: [    6.884175] FS-Cache: Netfs 'nfs'
registered for caching
Feb 24 13:50:34 server kernel: [   78.658626] FS-Cache: Cache
"mycache" added (type cachefiles)
Feb 24 13:50:34 server kernel: [   78.658631] CacheFiles: File cache
on sdb1 registered
Feb 24 14:10:43 server kernel: [ 1287.755925] CacheFiles: I/O Error:
Unlink failed
Feb 24 14:10:43 server kernel: [ 1287.755930] FS-Cache: Cache
cachefiles stopped due to I/O error

At the time of the error, the cache had close to 1gb in it:

/dev/sdb1             14428928    987852  12708112   8% /var/cache/fscache

I had looked a bit earlier at the number of files and it was over 35k
at that point (so was probably another 0-10k files by the time the
error occurred). Let me know if there's any info I can provide that
would be helpful.




More information about the Linux-cachefs mailing list