[Linux-cachefs] PROBLEM: ASSERT(object->dentry) fails in cachefiles_delete_object()

David Howells dhowells at redhat.com
Thu Sep 25 12:28:39 UTC 2014


Manuel Schölling <manuel.schoelling at gmx.de> wrote:

> [485208.579361] CacheFiles: Error: Unexpected object collision
> [485208.579364] object: OBJ1b354
> [485208.579367] objstate=LOOK_UP_OBJECT fl=8 wbusy=2 ev=0[0]
> [485208.579369] ops=0 inp=0 exc=0
> [485208.579371] parent=ffff88053f5417c0
> [485208.579373] cookie=ffff880538f202a0 [pr=ffff8805381b7160 nd=ffff880509c6eb78 fl=27]
> [485208.579375] key=[8] '2490000000000000'
> [485208.579381] xobject: OBJ1a600
> [485208.579384] xobjstate=DROP_OBJECT fl=70 wbusy=2 ev=0[0]
> [485208.579386] xops=0 inp=0 exc=0
> [485208.579387] xparent=ffff88053f5417c0
> [485208.579389] xcookie=ffff88050f4cbf70 [pr=ffff8805381b7160 nd=          (null) fl=12]

On the face of it, it looks like the first object should just be waiting for
the second to be destroyed.

The flags on the first object (fl=8) are:

	FSCACHE_OBJECT_IS_LIVE

and the flags on the second object (fl=70, printed in hex) are:

	FSCACHE_OBJECT_IS_LOOKED_UP
	FSCACHE_OBJECT_IS_AVAILABLE
	FSCACHE_OBJECT_RETIRED
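
For anyone wanting to check that decoding, here is a minimal userspace sketch.
The bit positions are an assumption inferred from the two fl= values above
(fl=8 is bit 3; fl=70, read as hex, is bits 4-6); they should be checked
against include/linux/fscache-cache.h for the kernel in question.
decode_flags() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ASSUMPTION: bit numbers inferred from the fl= values in the log above,
 * not copied from any particular kernel header. */
enum {
	OBJ_IS_LIVE      = 3,
	OBJ_IS_LOOKED_UP = 4,
	OBJ_IS_AVAILABLE = 5,
	OBJ_RETIRED      = 6,
};

static const char *const flag_names[] = {
	[OBJ_IS_LIVE]      = "FSCACHE_OBJECT_IS_LIVE",
	[OBJ_IS_LOOKED_UP] = "FSCACHE_OBJECT_IS_LOOKED_UP",
	[OBJ_IS_AVAILABLE] = "FSCACHE_OBJECT_IS_AVAILABLE",
	[OBJ_RETIRED]      = "FSCACHE_OBJECT_RETIRED",
};

#define NR_FLAG_NAMES (sizeof(flag_names) / sizeof(flag_names[0]))

/* Hypothetical helper: write the names of the known set bits in fl into
 * buf, separated by single spaces, and return how many were named.
 * Bits without a name in the table are silently skipped. */
static size_t decode_flags(unsigned long fl, char *buf, size_t len)
{
	size_t named = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (size_t bit = 0; bit < NR_FLAG_NAMES; bit++) {
		if (!(fl & (1UL << bit)) || flag_names[bit] == NULL)
			continue;
		if (buf[0] != '\0')
			strncat(buf, " ", len - strlen(buf) - 1);
		strncat(buf, flag_names[bit], len - strlen(buf) - 1);
		named++;
	}
	return named;
}
```

Calling decode_flags(0x8, buf, sizeof(buf)) and decode_flags(0x70, buf,
sizeof(buf)) with a buffer of 128 bytes or so reproduces the two lists above
under that assumed bit layout.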

I think that this test:

 -->	if (fscache_object_is_live(&object->fscache)) {
		pr_err("\n");
		pr_err("Error: Unexpected object collision\n");
		cachefiles_printk_object(object, xobject);
		BUG();
	}

is looking at the wrong object: it tests object (the new one) rather than
xobject (the old one that is already in the tree)...

Also xobject->flags is 1, which is:

	CACHEFILES_OBJECT_ACTIVE

so we should just proceed to the part following the above if-statement where
we wait for this flag to be cleared.
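
To make the difference concrete, here is a small userspace model of the
decision, not the kernel code itself.  struct model_object, the enum and both
functions are hypothetical stand-ins; the two bools model
FSCACHE_OBJECT_IS_LIVE and CACHEFILES_OBJECT_ACTIVE.  The "retry" outcome,
for an old object that is neither live nor active, is an assumption of the
model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the in-memory object; not the kernel struct. */
struct model_object {
	bool live;	/* models FSCACHE_OBJECT_IS_LIVE */
	bool active;	/* models CACHEFILES_OBJECT_ACTIVE */
};

enum collision_action {
	COLLISION_BUG,		/* old object still live: unexpected collision */
	COLLISION_WAIT,		/* old object going away: wait for ACTIVE to clear */
	COLLISION_RETRY,	/* ASSUMPTION: old object already gone, try again */
};

/* The test as it stands: it inspects the NEW object, which is always live
 * at this point, so every collision is reported as a bug. */
static enum collision_action check_collision_buggy(
	const struct model_object *object,
	const struct model_object *xobject)
{
	assert(object != NULL && xobject != NULL);
	if (object->live)
		return COLLISION_BUG;
	return xobject->active ? COLLISION_WAIT : COLLISION_RETRY;
}

/* The intended test: only a still-live OLD object is a real collision. */
static enum collision_action check_collision_fixed(
	const struct model_object *object,
	const struct model_object *xobject)
{
	assert(object != NULL && xobject != NULL);
	(void)object;	/* the new object's state is irrelevant here */
	if (xobject->live)
		return COLLISION_BUG;
	return xobject->active ? COLLISION_WAIT : COLLISION_RETRY;
}
```

With the state from the log (new object live; old object not live but still
active), the first function reports a bug and the second says to wait.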

Does this patch fix this oops for you?

David
---
commit cc0d3e7246ace3f5b695eb6a144461e041566f24
Author: David Howells <dhowells at redhat.com>
Date:   Thu Sep 25 11:10:06 2014 +0100

    CacheFiles: Fix incorrect test for in-memory object collision
    
    When CacheFiles cache objects are in use, they have in-memory representations,
    as defined by the cachefiles_object struct.  These are kept in a tree rooted in
    the cache and indexed by dentry pointer (since there's a unique mapping between
    object index key and dentry).
    
    Collisions can occur between a representation already in the tree and a new
    representation being set up because it takes time to dispose of an old
    representation - particularly if it must be unlinked or renamed.
    
    When such a collision occurs, cachefiles_mark_object_active() is meant to check
    to see if the old, already-present representation is in the process of being
    discarded (i.e. FSCACHE_OBJECT_IS_LIVE is not set on it) - and, if so, wait for
    the representation to be removed (i.e. until CACHEFILES_OBJECT_ACTIVE is
    cleared).
    
    However, the test for whether the old representation is still live is checking
    the new object - which will always be live at this point.  This leads to an
    oops looking like:
    
    	CacheFiles: Error: Unexpected object collision
    	object: OBJ1b354
    	objstate=LOOK_UP_OBJECT fl=8 wbusy=2 ev=0[0]
    	ops=0 inp=0 exc=0
    	parent=ffff88053f5417c0
    	cookie=ffff880538f202a0 [pr=ffff8805381b7160 nd=ffff880509c6eb78 fl=27]
    	key=[8] '2490000000000000'
    	xobject: OBJ1a600
    	xobjstate=DROP_OBJECT fl=70 wbusy=2 ev=0[0]
    	xops=0 inp=0 exc=0
    	xparent=ffff88053f5417c0
    	xcookie=ffff88050f4cbf70 [pr=ffff8805381b7160 nd=          (null) fl=12]
    	------------[ cut here ]------------
    	kernel BUG at fs/cachefiles/namei.c:200!
    	...
    	Workqueue: fscache_object fscache_object_work_func [fscache]
    	...
    	RIP: ... cachefiles_walk_to_object+0x7ea/0x860 [cachefiles]
    	...
    	Call Trace:
    	 [<ffffffffa04dadd8>] ? cachefiles_lookup_object+0x58/0x100 [cachefiles]
    	 [<ffffffffa01affe9>] ? fscache_look_up_object+0xb9/0x1d0 [fscache]
    	 [<ffffffffa01afc4d>] ? fscache_parent_ready+0x2d/0x80 [fscache]
    	 [<ffffffffa01b0672>] ? fscache_object_work_func+0x92/0x1f0 [fscache]
    	 [<ffffffff8107e82b>] ? process_one_work+0x16b/0x400
    	 [<ffffffff8107fc16>] ? worker_thread+0x116/0x380
    	 [<ffffffff8107fb00>] ? manage_workers.isra.21+0x290/0x290
    	 [<ffffffff81085edc>] ? kthread+0xbc/0xe0
    	 [<ffffffff81085e20>] ? flush_kthread_worker+0x80/0x80
    	 [<ffffffff81502d0c>] ? ret_from_fork+0x7c/0xb0
    	 [<ffffffff81085e20>] ? flush_kthread_worker+0x80/0x80
    
    Reported-by: Manuel Schölling <manuel.schoelling at gmx.de>
    Signed-off-by: David Howells <dhowells at redhat.com>

diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 83e9c94ca2cf..edd0961c20df 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -189,7 +189,7 @@ try_again:
 	/* an old object from a previous incarnation is hogging the slot - we
 	 * need to wait for it to be destroyed */
 wait_for_old_object:
-	if (fscache_object_is_live(&object->fscache)) {
+	if (fscache_object_is_live(&xobject->fscache)) {
 		pr_err("\n");
 		pr_err("Error: Unexpected object collision\n");
 		cachefiles_printk_object(object, xobject);



