[Linux-cluster] Re: GFS2 delete file have bug in cluster mode

Tue Mar 11 09:55:32 UTC 2008

Hi,

On Tue, 2008-03-11 at 10:24 +0800, chenzp wrote:
> 
> Hi Steve
> 
> 
> 2008/3/10, Steven Whitehouse <swhiteho at redhat.com>:
>         Hi,
>         
>         
>         On Mon, 2008-03-10 at 16:08 +0800, chenzp wrote:
>         > Hi all,
>         >
>         > I have 2 nodes, node1 and node2, running kernel 2.6.22.15.
>         > clustat shows both nodes as OK.
>         >
>         > I created a volume with mkfs.gfs2 using lock_dlm:
>         >   mkfs.gfs2 -p lock_dlm -t uitcluster:gfs2 -j 8 /dev/vg-test/lv-test
>         > and then mounted it:
>         >   mount.gfs2 /dev/vg-test/lv-test /mnt/gfs2
>         >
>         > I used dd to write a file on node1:
>         >   dd if=/dev/zero of=/mnt/gfs2/$(hostname).1 bs=1M count=1024
>         > When the write finished, the file looked fine from node2.
>         >
>         > But if I delete the node1.1 file from node2 (rm -rf /mnt/gfs2/node1.1),
>         > then ls -l /mnt/gfs2 and ll -h /mnt/gfs2/ show no files (size 0),
>         > yet df -h shows no change in used space; the space is not freed.
>         >
>         > In other words: if node1 writes a file and node2 deletes it, df -h
>         > shows that the space is not freed. If node1 writes a file and node1
>         > deletes it, everything is fine.
>         >
>         >
>         >
>         > carry.chen
>         >
>         
>         This is a bug from a while back. I suspect that if you do a drop caches
>         then the space will magically reappear. If you use a more recent kernel,
>         then this bug should also not occur.
>         
>         What is happening is that the dcache on one machine has kept the inode
>         in cache even when another node has unlinked the inode because it hasn't
>         seen the link count hit zero. The fix in more recent kernels results in
>         the dentry for an unlinked inode being flushed cluster wide. Once the
>         final iput has occurred, then the space is released to the filesystem.
>         
>         Steve.
>  
> 
> 
> I have checked the recent shortlog (/pub/scm /
> linux/kernel/git/steve/gfs2-2.6-nmw.git / shortlog),
> but could not find a changelog entry or commit for this bug.
> 
> thanks!
> 
> Carry.chen

We had two goes at fixing this problem. Here is the latest one:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=49e61f2ef6f7d1d0296e3e30d366b28e0ca595c2

The previous one failed only due to calling back into the lock module
from the lock module's own thread.
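
For the drop-caches workaround mentioned in the quoted reply above, a
minimal sketch follows. It assumes the /mnt/gfs2 mount point from the
original report and the standard /proc/sys/vm/drop_caches interface
(writing 2 asks the kernel to free dentries and inodes). The helper names
used_kb and drop_dentries are made up for illustration, and the script
would presumably need to run as root on the node that still has the
unlinked inode cached.

```shell
#!/bin/sh
# Sketch: compare used space on a mount before and after dropping the
# dentry/inode caches. MNT defaults to the mount point from the report.
MNT="${MNT:-/mnt/gfs2}"

# Print the used 1K-blocks of the filesystem containing $1,
# taken from the third column of POSIX-format df output.
used_kb() {
    df -P -k "$1" | awk 'NR == 2 { print $3 }'
}

# Ask the kernel to drop dentries and inodes (value 2). Requires root.
drop_dentries() {
    if [ "$(id -u)" -ne 0 ]; then
        echo "not root: skipping write to /proc/sys/vm/drop_caches" >&2
        return 1
    fi
    sync
    echo 2 > /proc/sys/vm/drop_caches
}

# Only act if the mount point exists on this machine.
if [ -d "$MNT" ]; then
    before=$(used_kb "$MNT")
    drop_dentries || true
    after=$(used_kb "$MNT")
    echo "used before: ${before} KB, after: ${after} KB"
fi
```

If the used figure falls after the drop, that is consistent with the
cached-inode explanation; on a kernel with the fix this step should not
be needed.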

Steve.