[Linux-cluster] GFS2 - monitoring the rate of Posix lock operations

Steven Whitehouse swhiteho at redhat.com
Mon Mar 29 12:35:46 UTC 2010


Hi,

On Mon, 2010-03-29 at 12:15 +0000, Jankowski, Chris wrote:
> Steven,
> 
> >>>You can use localflocks on each node provided you never access any of the locked files from more than one node at once (which may be true depending on how the failover is designed). Then you will get local fcntl lock performance at the expense of cluster fcntl locks.
> 
> I could guarantee that only one node will use the filesystem by putting mount/unmount into the start/stop script for the application service. This is the easy part.
> 
> What I would like to understand is what GFS2 recovery would look like after a failure of the node that had the filesystem mounted. I'd guess that the local locks will be gone with the failed system and there is nothing to recover. The only thing to do would be to replay the transaction log from the failed system. Is this correct?
> 
Yes, that is correct.

> This would work essentially like having a non-cluster filesystem such as ext3, but with recovery from a node failure doing only transaction log replay instead of a full fsck? Or would fsck still be triggered on the attempt to mount the filesystem on the other node?
> 
> Thanks and regards,
> 
> Chris
>  
Fsck would not be triggered on the mount attempt. It's not possible to
run fsck while the fs is mounted on any node.
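
If you do go the localflocks route, the fstab entry on each node might look
something like this (device and mount point names are only placeholders):

    /dev/cluster_vg/app_lv  /data  gfs2  noatime,localflocks,noauto  0 0

with noauto in there so that only the cluster service, and not the boot
scripts, ever mounts it.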

Steve.

> 
> -----Original Message-----
> From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Steven Whitehouse
> Sent: Monday, 29 March 2010 19:41
> To: linux clustering
> Subject: Re: [Linux-cluster] GFS2 - monitoring the rate of Posix lock operations
> 
> On Sun, 2010-03-28 at 02:32 +0000, Jankowski, Chris wrote:
> > Steve,
> > 
> > Q2:
> > >>> Are you sure that the workload isn't causing too many cache invalidations due to sharing files/directories between nodes? This is the most usual cause of poor performance.
> > 
> > The other node is completely idle and kept that way by design. Users are connecting through an IP alias managed by the application service. Application administrators also log in through the alias to do their maintenance work. In the case of this particular test I manually listed what is running where. I am very conscious of the fact that accesses from multiple nodes invalidate local in-memory caching.
> > 
> > Q3:
> > >>>Have you used the noatime mount option? If you can use it, it's highly recommended. Also turn off selinux if that is running on the GFS2 filesystem.
> > 
> > The filesystem is mounted with the noatime option but not nodiratime. SELinux is disabled.
> > 
> nodiratime isn't supported, noatime is enough.
> 
> > Q4:
> > >>>Potentially there might be. I don't know enough about the application to say, but it depends on how the workload can be arranged,
> > 
> > The application runs on one node at a time. It has to, as it uses shared memory. The application uses a database of indexed files. There are thousands of them. Also, it uses standard UNIX file locking and range locking.
> > 
> > What else can I do to minimise the GFS2 locking overhead in this asymmetrical configuration?
> > 
> You can use localflocks on each node provided you never access any of the locked files from more than one node at once (which may be true depending on how the failover is designed). Then you will get local fcntl lock performance at the expense of cluster fcntl locks.
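> 
> The fcntl locks in question are just ordinary POSIX byte-range locks, i.e.
> something along these lines (the byte range here is only an example):
> 
>     #include <fcntl.h>
>     #include <unistd.h>
> 
>     /* Take a blocking write lock on bytes 100..149 of an already-open fd.
>      * With localflocks this is resolved on the local node; without it the
>      * request goes through the cluster plock path. */
>     int lock_range(int fd)
>     {
>             struct flock fl = {
>                     .l_type   = F_WRLCK,
>                     .l_whence = SEEK_SET,
>                     .l_start  = 100,
>                     .l_len    = 50,
>             };
>             return fcntl(fd, F_SETLKW, &fl);
>     }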
> 
> > Q5:
> > Is it the case that when gfs_controld reaches 100% usage of one CPU core, that becomes a hard limit on the rate of Posix lock operations?  Is there only one gfs_controld daemon servicing all GFS2 filesystems, or is one run per filesystem?  In the latter case I would have thought that breaking the one filesystem I have into several might help. Would it not?
> > 
> > Thanks and regards,
> > 
> > Chris
> > 
> Assuming that you have a version in which gfs_controld takes care of the locking (newer GFS2 sends the locks via dlm_controld), then yes, that will provide a hard limit on the rate at which locks can be acquired/dropped,
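> 
> If you want a rough feel for where that limit sits, one simple approach is
> to time fcntl lock/unlock cycles against a file on the GFS2 mount. A minimal
> sketch (the test file path is just a placeholder):
> 
>     #include <fcntl.h>
>     #include <stdio.h>
>     #include <time.h>
>     #include <unistd.h>
> 
>     /* Time a number of plock acquire/release cycles and print the rate. */
>     int main(void)
>     {
>             const int n = 100000;
>             int i;
>             int fd = open("/mnt/gfs2/plock-test", O_RDWR | O_CREAT, 0644);
>             if (fd < 0) {
>                     perror("open");
>                     return 1;
>             }
> 
>             struct flock fl = { .l_whence = SEEK_SET, .l_start = 0, .l_len = 1 };
>             struct timespec t0, t1;
> 
>             clock_gettime(CLOCK_MONOTONIC, &t0);
>             for (i = 0; i < n; i++) {
>                     fl.l_type = F_WRLCK;
>                     if (fcntl(fd, F_SETLKW, &fl) < 0) {
>                             perror("fcntl");
>                             return 1;
>                     }
>                     fl.l_type = F_UNLCK;
>                     fcntl(fd, F_SETLK, &fl);
>             }
>             clock_gettime(CLOCK_MONOTONIC, &t1);
> 
>             printf("%.0f lock+unlock pairs per second\n",
>                    n / ((t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9));
>             close(fd);
>             return 0;
>     }
> 
> Running it with and without localflocks should make the difference in plock
> throughput fairly obvious.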
> 
> Steve.
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



