[Linux-cluster] lock_dlm: gdlm_cancel messages?

Edward Muller emuller at engineyard.com
Tue Sep 2 22:11:35 UTC 2008


We have a customer who, we believe, is putting excessive locking
pressure on one of several GFS volumes (9 in total across 5 systems).

They've started to see occasional load spikes during which the GFS
volume appears to be "locking" for a minute or two. The spikes clear
without any action on our part, and everything continues as normal.

And we've recently seen the following log entries:

Sep  2 12:57:57 xc88-s00007 kernel: lock_dlm: gdlm_cancel 1,2 flags 0
Sep  2 12:57:57 xc88-s00007 kernel: lock_dlm: gdlm_cancel skip 1,2 flags 0
Sep  2 12:57:58 xc88-s00007 kernel: lock_dlm: gdlm_cancel 1,2 flags 0
Sep  2 12:57:58 xc88-s00007 kernel: lock_dlm: gdlm_cancel skip 1,2 flags 0
Sep  2 12:58:40 xc88-s00007 kernel: lock_dlm: gdlm_cancel 1,2 flags 0
Sep  2 12:58:40 xc88-s00007 kernel: lock_dlm: gdlm_cancel skip 1,2 flags 0
Sep  2 12:58:58 xc88-s00007 kernel: lock_dlm: gdlm_cancel 1,2 flags 0
Sep  2 12:58:58 xc88-s00007 kernel: lock_dlm: gdlm_cancel skip 1,2 flags 0
Sep  2 12:59:14 xc88-s00007 kernel: lock_dlm: gdlm_cancel 1,2 flags 0
Sep  2 12:59:14 xc88-s00007 kernel: lock_dlm: gdlm_cancel skip 1,2 flags 0
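
To line these messages up against the load spikes, something like the
following could count gdlm_cancel entries per host, minute and lock.
This is only a rough sketch: the regular expression is inferred from
the sample lines above, it assumes the lock name is printed as two hex
fields separated by a comma, and the function name is made up for
illustration. The "skip" lines are ignored on purpose so each cancel is
counted once.

```python
import re
from collections import Counter

# Pattern inferred from the sample log lines above (an assumption, not a
# documented format). Requiring the lock name directly after "gdlm_cancel"
# excludes the companion "gdlm_cancel skip ..." lines.
CANCEL_RE = re.compile(
    r"^(?P<mon>\w{3})\s+(?P<day>\d+)\s+(?P<hhmm>\d\d:\d\d):\d\d\s+"
    r"(?P<host>\S+)\s+kernel:\s+lock_dlm:\s+gdlm_cancel\s+"
    r"(?P<lock>[0-9a-f]+,[0-9a-f]+)\s"
)


def cancels_per_minute(lines):
    """Count gdlm_cancel messages keyed by (host, minute, lock name)."""
    counts = Counter()
    for line in lines:
        m = CANCEL_RE.match(line)
        if m:
            minute = f'{m["mon"]} {int(m["day"])} {m["hhmm"]}'
            counts[(m["host"], minute, m["lock"])] += 1
    return counts
```

It can be fed any iterable of lines, for example an open syslog file
(the path depends on the local syslog configuration).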

For all intents and purposes we're running RHCS2 from RHEL 5.2 with the
RHEL 5.2 kernel (2.6.18-92.1.10).

This used to happen to this customer much more frequently on RHCS1
(1.03); since we upgraded them to the RHCS2 packages and kernel above,
things have been much better.

I'm going to start dumping gfs_tool counters data for the various gfs  
filesystems.
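
For the counter dumps, a small poller along these lines is one way to
get timestamped snapshots that can later be compared with the spike
times. It is a sketch only: it assumes gfs_tool is on the PATH and is
invoked as "gfs_tool counters <mountpoint>", and the function names,
mount points, output path and interval are placeholders.

```python
import subprocess
import time
from datetime import datetime, timezone


def dump_counters(mountpoints, out, run=subprocess.run):
    """Append one timestamped `gfs_tool counters` snapshot per mount point."""
    for mp in mountpoints:
        stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        # stderr is merged into stdout so tool errors land in the same log.
        result = run(
            ["gfs_tool", "counters", mp],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        out.write(f"=== {stamp} {mp} rc={result.returncode}\n")
        out.write(result.stdout)
        if not result.stdout.endswith("\n"):
            out.write("\n")


def poll(mountpoints, path, interval=10.0, rounds=None):
    """Snapshot every `interval` seconds; run forever if `rounds` is None."""
    done = 0
    while rounds is None or done < rounds:
        with open(path, "a") as out:
            dump_counters(mountpoints, out)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

Calling poll() with the list of GFS mount points and a log file path
leaves one "===" header per snapshot, which makes it easy to cut out
the window around a spike afterwards.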

Any advice on tracking this down would be appreciated.

Thanks!


--
Edward Muller
Engine Yard Inc. : Support, Scalability, Reliability
+1.866.518.9273 x209  - Mobile: +1.417.844.2435
IRC: edwardam - XMPP/GTalk: emuller at engineyard.com
Pacific/US
