[Linux-cluster] GFS2 processes getting stuck in WCHAN=dlm_posix_lock

Dustin Henry Offutt dhoffutt at gmail.com
Sat Oct 31 04:04:45 UTC 2009


This sounds like a memory problem in the mail application or the OS that is
spilling over into the cluster software. Try tracing the in-use memory heaps
in a dump.
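
If it helps with narrowing this down, here is a minimal sketch for listing the
processes that are in state "D" with WCHAN "dlm_posix_lock", as described in
the message below. It assumes Python 3 (RHEL 5 ships an older Python, so it
may need adjusting) and only reads /proc/<pid>/stat and /proc/<pid>/wchan.
The function name find_stuck and its parameters are made up for illustration.

```python
import os


def find_stuck(proc_root="/proc", wchan_name="dlm_posix_lock"):
    """Return sorted (pid, command) pairs for D-state processes in wchan_name.

    proc_root is a parameter only so the function can be pointed at a
    test directory; on a real node it should stay at /proc.
    """
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "stat")) as f:
                stat = f.read()
            with open(os.path.join(base, "wchan")) as f:
                wchan = f.read().strip()
        except (IOError, OSError):
            continue  # process exited meanwhile, or files are not readable
        # stat looks like "pid (comm) state ..."; comm may itself contain
        # spaces or parentheses, so split on the LAST closing parenthesis.
        left, _, rest = stat.rpartition(")")
        comm = left.partition("(")[2]
        fields = rest.split()
        state = fields[0] if fields else ""
        if state == "D" and wchan == wchan_name:
            stuck.append((int(entry), comm))
    return sorted(stuck)


if __name__ == "__main__":
    for pid, comm in find_stuck():
        print(pid, comm)
```

Running it on the affected node while the load average is climbing should show
whether the stuck PIDs really are all IMAP processes, and gives a list of PIDs
to look for in the dump.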

On Fri, Oct 30, 2009 at 6:27 PM, Allen Belletti <allen at isye.gatech.edu> wrote:

> Hi All,
>
> As I've mentioned before, I'm running a two-node clustered mail server on
> GFS2 (with RHEL 5.4).  Nearly all of the time, everything works great.
> However, going all the way back to GFS1 on RHEL 5.1 (I think it was), I've
> had occasional locking problems that force a reboot of one or both cluster
> nodes.  Lately I've paid closer attention since it's been happening more
> often.
>
> I'll notice the problem when the load average starts rising.  It's always
> tied to "stuck" processes, and I believe always tied to IMAP clients (I'm
> running Dovecot.)  It seems like a file belonging to user "x" (in this case,
> "jforrest") will become locked in some way, such that every IMAP process tied
> to that user will get stuck on the same thing.  Over time, as the user keeps
> trying to read that file, more & more processes accumulate.  They're always
> in state "D" (uninterruptible sleep), and always on "dlm_posix_lock"
> according to WCHAN.  The only way I'm able to get out of this state is to
> reboot.  If I let it persist for too long, I/O generally stops entirely.
>
> This certainly seems like it ought to have a definite solution, but I've no
> idea what it is.  I've tried a variety of things using "find" to pinpoint a
> particular file, but everything belonging to the affected user seems just
> fine.  At least, I can read and copy all of the files, and do a stat via ls
> -l.
>
> Is it possible that this is a bug, not within GFS at all, but within
> Dovecot IMAP?
>
> Any thoughts would be appreciated.  It's been getting worse lately and thus
> no fun at all.
>
> Cheers,
> Allen