[Linux-cluster] Freeze with cluster-2.03.11

Wendy Cheng s.wendy.cheng at gmail.com
Mon Apr 6 04:12:34 UTC 2009


>
> Then don't remove it yet. The ramification needs more thoughts ...
>

That generic_drop_inode() can *not* be removed.

Not sure whether my head is clear enough this time ....

Based on code reading ...
1. iput() gets inode_lock (a spin lock)
2. iput() calls iput_final()
3. iput_final() calls gfs_drop_inode() that calls
    generic_drop_inode()
4. generic_drop_inode() unlocks inode_lock.
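
A rough userspace sketch of the four steps above, to make the lock hand-off explicit. It is not kernel code: every name ending in _model is a hypothetical stand-in, the spin lock is reduced to a boolean flag, and the lock is taken unconditionally in the iput stand-in for simplicity. It only shows that the lock taken in step 1 is released in step 4, so a drop hook that skips the generic helper returns with the lock still held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Models the global inode_lock; a flag is enough to track who holds it. */
static bool inode_lock_held;

static void model_spin_lock(void)   { assert(!inode_lock_held); inode_lock_held = true; }
static void model_spin_unlock(void) { assert(inode_lock_held);  inode_lock_held = false; }

/* Hypothetical, stripped-down inode: just what the call chain needs. */
struct inode_model {
    int  i_count;                              /* reference count */
    bool dropped;                              /* set once the generic helper ran */
    void (*drop_inode)(struct inode_model *);  /* filesystem hook, may be NULL */
};

/* Step 4: the generic helper is the one that releases inode_lock. */
static void generic_drop_inode_model(struct inode_model *inode)
{
    inode->dropped = true;
    model_spin_unlock();
}

/* Step 3: the filesystem hook runs under inode_lock and ends in the generic helper. */
static void gfs_drop_inode_model(struct inode_model *inode)
{
    /* GFS-specific work would run here, still under inode_lock. */
    generic_drop_inode_model(inode);
}

/* The same hook with the generic call removed: nobody unlocks. */
static void gfs_drop_inode_without_generic_model(struct inode_model *inode)
{
    (void)inode;
}

/* Step 2: pick the filesystem hook if there is one, else the generic helper. */
static void iput_final_model(struct inode_model *inode)
{
    void (*drop)(struct inode_model *) = generic_drop_inode_model;

    if (inode->drop_inode != NULL)
        drop = inode->drop_inode;
    drop(inode);
}

/* Step 1: take the lock, and on the last reference hand it down the chain. */
static void iput_model(struct inode_model *inode)
{
    model_spin_lock();
    if (--inode->i_count == 0)
        iput_final_model(inode);   /* expected to return with the lock released */
    else
        model_spin_unlock();
}
```

Calling iput_model() on an inode whose hook is gfs_drop_inode_model leaves inode_lock_held false afterwards; with gfs_drop_inode_without_generic_model it stays true, which in the real kernel would mean the next thread to take inode_lock spins forever.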

In theory, this logic violates the intended usage of a spin lock: it is 
expected to be held for a short period of time, but gfs_drop_inode() could 
take a while to finish. It does a blocking page write that needs to make 
sure the data gets synced to storage before it can return. What makes 
matters worse is that inode_lock is a global lock, so it could block 
non-GFS threads as well. One would think a quick fix is to drop the 
inode_lock at the beginning of gfs_drop_inode() and then re-acquire it 
after GFS syncs the page. Unfortunately, inode_lock is not an exported 
symbol, and GFS is an out-of-tree filesystem that has to be compiled as a 
kernel module. So this trick won't work for GFS.
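
For illustration only, the shape of that quick fix in the same kind of userspace model. All names here are hypothetical stand-ins, the lock is a boolean flag, and slow_sync_model() merely records whether it ran under the lock. In a real module the two lock calls in the second variant are exactly what cannot be written, since inode_lock is not exported.

```c
#include <assert.h>
#include <stdbool.h>

static bool inode_lock_held;       /* models the global inode_lock */
static bool synced_while_locked;   /* did the slow sync run under the lock? */

static void model_spin_lock(void)   { assert(!inode_lock_held); inode_lock_held = true; }
static void model_spin_unlock(void) { assert(inode_lock_held);  inode_lock_held = false; }

/* Hypothetical stand-in for the blocking page write to storage. */
static void slow_sync_model(void)
{
    synced_while_locked = inode_lock_held;
}

/* As in the real call chain, the generic helper releases the lock. */
static void generic_drop_inode_model(void)
{
    model_spin_unlock();
}

/* Current shape: the slow sync runs while the global lock is held. */
static void drop_inode_sync_under_lock(void)
{
    slow_sync_model();
    generic_drop_inode_model();
}

/* The "quick fix": unlock, sync, re-lock, then let the generic helper unlock. */
static void drop_inode_unlock_relock(void)
{
    model_spin_unlock();          /* needs access to inode_lock */
    slow_sync_model();
    model_spin_lock();            /* needs access to inode_lock */
    generic_drop_inode_model();
}
```

Both variants are entered with the lock held and return with it released; only the second keeps the slow sync outside the locked region. A real fix would probably also have to re-check the inode's state after re-acquiring the lock, since other threads may have touched it in between.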

With a flight to catch tomorrow and a flu-infected body, I have lost the 
will to think over what the correct fix should and/or will be.

-- Wendy



