[Linux-cluster] GFS2 try_rgrp_unlink consuming lots of CPU

Tue Oct 27 09:57:06 UTC 2009

Hi,

On Mon, 2009-10-26 at 15:47 -0700, Miller, Gordon K wrote:
> When making our GFS2 filesystems, we use default values, with the exception of the journal size, which we have set to 16 MB. Our resource groups are 443 MB in size for this filesystem.
> 
> I do not believe that we have the case of unlinking inodes from one node while they are still open on another.
> 
> Under what conditions would try_rgrp_unlink return the same inode when called repeatedly in a short time frame as seen in the original problem description? I am unable to correlate any call to gfs2_unlink on any node in the cluster with the inodes that try_rgrp_unlink is returning.
> 
> Gordon
> 
It depends on which kernel version you have. In earlier kernels it tried to
deallocate inodes in an rgrp only once for each mount of the filesystem.
That proved to cause a problem for some configurations where we were not
aggressive enough in reclaiming free space. As a result, the algorithm
was updated to scan more often.

However, in both cases it was designed to always make progress and not
continue to rescan the same inode, so something very odd is going on.
The only reason that an inode would be scanned repeatedly is that it has
been unlinked somewhere (since the scanning looks only for unlinked
inodes) and cannot be deallocated for some reason (e.g. it is still in
use), and thus is still there when the next scan comes along.
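
As a rough illustration of the behaviour described above, here is a toy
model in Python. It is not the kernel code and does not reflect how
try_rgrp_unlink is actually implemented; BlockState, scan_rgrp and
in_use are made-up names for this sketch, and the "bitmap" is just a
list. It only shows the intended property: a single scan always moves
forward, and an inode turns up again on a later scan only if it is
unlinked and could not be deallocated in the meantime.

```python
from enum import Enum


class BlockState(Enum):
    FREE = 0
    USED = 1
    UNLINKED = 2  # unlinked inode waiting to be deallocated


def scan_rgrp(bitmap, in_use):
    """Make one pass over a toy resource-group bitmap.

    bitmap is a list of BlockState values; in_use is a set of block
    indices modelling inodes that cannot be deallocated yet (for
    example because something still holds them open).
    Returns the indices that the scan found in the UNLINKED state.
    """
    found = []
    pos = 0
    while pos < len(bitmap):
        if bitmap[pos] is BlockState.UNLINKED:
            found.append(pos)
            if pos not in in_use:
                # Deallocation succeeded, so later scans skip this block.
                bitmap[pos] = BlockState.FREE
        pos += 1  # always advance past the block just examined
    return found


bitmap = [BlockState.USED, BlockState.UNLINKED,
          BlockState.FREE, BlockState.UNLINKED]
in_use = {3}  # block 3 is unlinked but still held somewhere

first = scan_rgrp(bitmap, in_use)   # sees blocks 1 and 3, frees block 1
second = scan_rgrp(bitmap, in_use)  # sees only block 3 again
in_use.clear()                      # the holder lets go of block 3
third = scan_rgrp(bitmap, in_use)   # sees block 3 and frees it
fourth = scan_rgrp(bitmap, in_use)  # nothing left to find
print(first, second, third, fourth)
```

In this model block 3 is the only one reported more than once, and only
for as long as it stays in in_use. That is the situation to look for
when the same inode keeps coming back.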

Steve.