[Linux-cachefs] [PATCH] CacheFiles: Fix a race in cachefiles_delete_object() vs rename
Mark Moseley
moseleymark at gmail.com
Wed Feb 24 19:46:08 UTC 2010
On Fri, Feb 19, 2010 at 10:14 AM, David Howells <dhowells at redhat.com> wrote:
> cachefiles_delete_object() can race with rename. It gets the parent directory
> of the object it's asked to delete, then locks it - but rename may have changed
> the object's parent between the get and the completion of the lock.
>
> However, if such a circumstance is detected, we abandon our attempt to delete
> the object - since it's no longer in the index key path, it won't be seen
> again by lookups of that key. The assumption is that cachefilesd may have
> culled it by renaming it to the graveyard for later destruction.
>
> Signed-off-by: David Howells <dhowells at redhat.com>
> ---
>
> fs/cachefiles/namei.c | 12 +++++++++++-
> 1 files changed, 11 insertions(+), 1 deletions(-)
>
> diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
> index 14ac480..eeb4986 100644
> --- a/fs/cachefiles/namei.c
> +++ b/fs/cachefiles/namei.c
> @@ -348,7 +348,17 @@ int cachefiles_delete_object(struct cachefiles_cache *cache,
> dir = dget_parent(object->dentry);
>
> mutex_lock_nested(&dir->d_inode->i_mutex, I_MUTEX_PARENT);
> - ret = cachefiles_bury_object(cache, dir, object->dentry);
> +
> + /* we need to check that our parent is _still_ our parent - it may have
> + * been renamed */
> + if (dir == object->dentry->d_parent) {
> + ret = cachefiles_bury_object(cache, dir, object->dentry);
> + } else {
> + /* it got moved, presumably by cachefilesd culling it, so it's
> + * no longer in the key path and we can ignore it */
> + mutex_unlock(&dir->d_inode->i_mutex);
> + ret = 0;
> + }
>
> dput(dir);
> _leave(" = %d", ret);
>
Dunno if this patch was related to the other thread where I mentioned
the "Unlink failed" error. But since it's coming from the called
function -- cachefiles_bury_object -- of the patched function I'm
guessing it's related, so I figured I'd mention that with 2.6.32.9
(with the above patch applied) this morning, it's still giving the
same error:
Feb 24 13:50:31 server kernel: [ 3.064935] FS-Cache: Loaded
Feb 24 13:50:31 server kernel: [ 3.082345] CacheFiles: Loaded
Feb 24 13:50:31 server kernel: [ 6.884175] FS-Cache: Netfs 'nfs'
registered for caching
Feb 24 13:50:34 server kernel: [ 78.658626] FS-Cache: Cache
"mycache" added (type cachefiles)
Feb 24 13:50:34 server kernel: [ 78.658631] CacheFiles: File cache
on sdb1 registered
Feb 24 14:10:43 server kernel: [ 1287.755925] CacheFiles: I/O Error:
Unlink failed
Feb 24 14:10:43 server kernel: [ 1287.755930] FS-Cache: Cache
cachefiles stopped due to I/O error
At the time of the error, the cache had close to 1gb in it:
/dev/sdb1 14428928 987852 12708112 8% /var/cache/fscache
I had looked a bit earlier at the number of files and it was over 35k
at that point (so was probably another 0-10k files by the time the
error occurred). Let me know if there's any info I can provide that
would be helpful.
More information about the Linux-cachefs
mailing list