[Linux-cluster] Freeze with cluster-2.03.11
Wendy Cheng
s.wendy.cheng at gmail.com
Fri Apr 3 14:15:39 UTC 2009
Kadlecsik Jozsef wrote:
> On Thu, 2 Apr 2009, Wendy Cheng wrote:
>
>
>>>> Kadlecsik Jozsef wrote:
>>>>
>>>>
>>>>> - commit 82d176ba485f2ef049fd303b9e41868667cebbdb
>>>>> gfs_drop_inode as .drop_inode replacing .put_inode.
>>>>> .put_inode was called without holding a lock, but .drop_inode
>>>>> is called with inode_lock held. Might it be a problem?
>>>>>
>>>>>
>> Based on code reading ...
>> 1. iput() gets inode_lock (a spin lock)
>> 2. iput() calls iput_final()
>> 3. iput_final() calls filesystem drop_inode(), followed by
>> generic_drop_inode()
>> 4. generic_drop_inode() unlocks inode_lock after doing all sorts of fun things
>> with the inode
>>
>> So it looks to me that the generic_drop_inode() statement within
>> gfs_drop_inode() should be removed. Otherwise you would get a double
>> unlock and a double list free.
>>
>
> I think those function calls are right: iput_final calls either the
> filesystem drop_inode function (in this case gfs_drop_inode) or
> generic_drop_inode. There's no double call of generic_drop_inode. However
> gfs_sync_page_i (and in turn filemap_fdatawrite and filemap_fdatawait) is
> now called with inode_lock held, and that was not so in previous versions.
> But I'm just speculating.
>
It *is* called twice, unless my eyes deceive me:

static inline void iput_final(struct inode *inode)
{
    const struct super_operations *op = inode->i_sb->s_op;
    void (*drop)(struct inode *) = generic_drop_inode;

    if (op && op->drop_inode)
        drop = op->drop_inode; /* gfs calls generic_drop_inode() */
    drop(inode); /* second call into generic_drop_inode() */
}
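
One way to check the call count without a kernel build is a small user-space model of the dispatch quoted above. This is only a sketch: every name ending in _model, the count_generic_calls() helper, and the boolean standing in for inode_lock are hypothetical, and the filesystem hook here is assumed to forward to the generic routine exactly once, which may not match the real gfs_drop_inode().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel types. */
struct inode_model;

struct super_ops_model {
    void (*drop_inode)(struct inode_model *);
};

struct inode_model {
    const struct super_ops_model *op;
};

static bool inode_lock_held;   /* models the inode_lock spin lock */
static int generic_calls;      /* how often the generic routine ran */

/* Models generic_drop_inode(): expects the lock held, then releases it. */
static void generic_drop_inode_model(struct inode_model *inode)
{
    (void)inode;
    assert(inode_lock_held);   /* a second entry would trip here */
    generic_calls++;
    inode_lock_held = false;
}

/* Models a filesystem hook (assumption: it forwards exactly once). */
static void fs_drop_inode_model(struct inode_model *inode)
{
    /* filesystem-specific work would go here */
    generic_drop_inode_model(inode);
}

/* Same dispatch shape as the iput_final() quoted above. */
static void iput_final_model(struct inode_model *inode)
{
    const struct super_ops_model *op = inode->op;
    void (*drop)(struct inode_model *) = generic_drop_inode_model;

    if (op && op->drop_inode)
        drop = op->drop_inode;
    drop(inode);
}

/* Takes the modeled lock, drops one inode, returns the generic call count. */
int count_generic_calls(bool use_fs_hook)
{
    static const struct super_ops_model fs_ops = { fs_drop_inode_model };
    static const struct super_ops_model no_ops = { NULL };
    struct inode_model inode = { use_fs_hook ? &fs_ops : &no_ops };

    generic_calls = 0;
    inode_lock_held = true;    /* iput() takes inode_lock first */
    iput_final_model(&inode);
    return generic_calls;
}
```

Calling count_generic_calls(true) and count_generic_calls(false) from a test program compares the path through the filesystem hook with the default path. The assert in generic_drop_inode_model() aborts if the generic routine is ever entered after the modeled lock has already been released, which is the double-unlock case under discussion.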
>
>
>> In short, *remove* line #73 from gfs-kernel/src/gfs/ops_super.c in your
>> source and let us know how it goes.
>>
>
> I won't get a chance to start a test before Monday, sorry.
>
>
I'll be traveling next week as well. However, a few cautious words here:
Even if this "fix" eventually solves your hang, running GFS on newer
kernels in a production system is simply *not* a good idea.
-- Wendy