[Linux-cluster] GFS2 processes getting stuck in WCHAN=dlm_posix_lock

Allen Belletti allen at isye.gatech.edu
Mon Nov 2 20:02:43 UTC 2009


Hi Dave,

On 11/02/2009 12:11 PM, David Teigland wrote:
> On Fri, Oct 30, 2009 at 07:27:23PM -0400, Allen Belletti wrote:
>    
>> I'll notice the problem when the load average starts rising.  It's
>> always tied to "stuck" processes, and I believe always tied to IMAP
>> clients (I'm running Dovecot.)  It seems like a file belonging to user
>> "x" (in this case, "jforrest") will become locked in some way, such that
>> every IMAP process tied to that user will get stuck on the same thing.
>> Over time, as the user keeps trying to read that file, more & more
>> processes accumulate.  They're always in state "D" (uninterruptible
>> sleep), and always on "dlm_posix_lock" according to WCHAN.  The only way
>> I'm able to get out of this state is to reboot.  If I let it persist for
>> too long, I/O generally stops entirely.
>>      
> Next time, try to collect all the following information as soon as you can
> after the first process gets stuck:
>
> - ps showing pid of stuck/"D" process(es) and WCHAN
> - which file they are stuck trying to lock
>    (and the inode number of it, you may need to wait until after the
>     reboot to use ls -li on the file to get the inode number)
> - group_tool dump plocks <fsname> from all the nodes
>
> I'm guessing that dovecot does some "unusual" combinations of locking,
> closing, renaming, unlinking files.  Those combinations are especially
> prone to races and bugs that cause posix lock state to get off.
>    
I'll collect all of this as soon as I catch the problem in action 
again.  Do you know how I might go about determining which file is 
involved?  I can find the user because it's associated with the 
particular "imap" process, but haven't been able to figure out what's 
being locked.
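
For the first item on the list above (ps showing the pid and WCHAN of
stuck processes), a minimal sketch follows.  The helper name
"stuck_procs" is made up for illustration, and the wide "wchan:32"
column assumes a procps-style ps; it is only there so that
"dlm_posix_lock" is not truncated.  This does not identify the file
being locked.

```shell
# Hypothetical helper: keep the header line plus any process whose
# state (second column) starts with "D" (uninterruptible sleep).
# Expects input columns in the order: PID STAT WCHAN COMMAND.
stuck_procs() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Assumed usage with a procps-style ps:
#   ps -eo pid,stat,wchan:32,cmd | stuck_procs
```

Saving that output with a timestamp on each node as soon as the load
starts climbing would give the pids and WCHAN values to go alongside
the plock dumps.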

Thanks,
Allen

-- 
Allen Belletti
allen at isye.gatech.edu                             404-894-6221 Phone
Industrial and Systems Engineering                404-385-2988 Fax
Georgia Institute of Technology



