[Linux-cluster] GFS2 - monitoring the rate of Posix lock operations

Steven Whitehouse swhiteho at redhat.com
Mon Mar 29 08:41:29 UTC 2010


On Sun, 2010-03-28 at 02:32 +0000, Jankowski, Chris wrote:
> Steve,
> 
> Q2:
> >>> Are you sure that the workload isn't causing too many cache invalidations due to sharing files/directories between nodes? This is the most usual cause of poor performance.
> 
> The other node is completely idle and kept that way by design. Users are connecting through an IP alias managed by the application service. Application administrators also log in through the alias to do their maintenance work. In the case of this particular test I manually listed what is running where. I am very conscious of the fact that accesses from multiple nodes invalidate local in-memory caching.
> 
> Q3:
> >>> Have you used the noatime mount option? If you can use it, it's highly recommended. Also turn off SELinux if it is running on the GFS2 filesystem.
> 
> The filesystem is mounted with the noatime and nodiratime options. SELinux is disabled.
> 
nodiratime isn't supported on GFS2; noatime alone is enough.
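
For reference, an /etc/fstab entry using noatime might look like the
line below; the device and mount point names are just made-up examples:

    /dev/clustervg/gfs2lv  /mnt/gfs2  gfs2  noatime  0 0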

> Q4:
> >>> Potentially there might be. I don't know enough about the application to say, but it depends on how the workload can be arranged,
> 
> The application runs on one node at a time.  It has to, as it uses shared memory. The application uses a database of indexed files. There are thousands of them. Also, it uses standard UNIX file locking and range locking.
> 
> What else can I do to minimise the GFS2 locking overhead in this asymmetrical configuration?
> 
You can use the localflocks mount option on each node, provided you never
access any of the locked files from more than one node at once (which may
be true depending on how the failover is designed). Then you will get
local fcntl lock performance at the expense of cluster-wide fcntl locks.
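
For example (device and mount point are placeholders again), the mount
invocation would become:

    mount -t gfs2 -o noatime,localflocks /dev/clustervg/gfs2lv /mnt/gfs2

Bear in mind that localflocks makes both flock() and POSIX (fcntl) locks
node-local, so the failover design must guarantee that a file is never
locked from two nodes at the same time.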

> Q5:
> Is it the case that when gfs_controld reaches 100% usage of one CPU core, that becomes a hard limit on the rate of POSIX lock operations?  Is there only one gfs_controld daemon servicing all GFS2 filesystems, or is one run per filesystem?  In the latter case I would have thought that breaking the one filesystem I have into several might help. Would it not?
> 
> Thanks and regards,
> 
> Chris
> 
Assuming that you have a version in which gfs_controld takes care of the
locking (newer GFS2 versions send the locks via dlm_controld instead),
then yes, that will impose a hard limit on the rate at which locks can be
acquired/dropped.
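
If you want to see where that ceiling sits on a given setup, a rough
micro-benchmark is easy to write. Here's a sketch (untested, and the
file path is just an example; point it at a file on the GFS2 mount).
It times fcntl() write-lock/unlock cycles on a single byte and prints
the sustained rate:

#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    /* Hypothetical path; use a file on the GFS2 filesystem under test. */
    int fd = open("/gfs2/lockrate.test", O_RDWR | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    struct flock fl = {
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 1,          /* lock just the first byte */
    };

    const long iterations = 100000;
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    for (long i = 0; i < iterations; i++) {
        fl.l_type = F_WRLCK;
        if (fcntl(fd, F_SETLKW, &fl) < 0) {  /* take the write lock */
            perror("fcntl lock");
            return 1;
        }
        fl.l_type = F_UNLCK;
        if (fcntl(fd, F_SETLK, &fl) < 0) {   /* drop it again */
            perror("fcntl unlock");
            return 1;
        }
    }

    clock_gettime(CLOCK_MONOTONIC, &t1);
    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%.0f lock/unlock cycles per second\n", (double)iterations / secs);

    close(fd);
    return 0;
}

Compile with gcc -O2 lockrate.c (add -lrt on older glibc). Comparing the
rate with localflocks against the clustered default shows directly how
much the cluster lock path costs.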

Steve.