[Linux-cluster] GFS2 and D state HTTPD processes

Emilio Arjona emilio.ah at gmail.com
Tue Apr 27 11:58:39 UTC 2010


Thanks Ricardo,

We don't want to update the server right now because it's in production. We
plan to update the system in the summer, when its load is low.

In the latest incidents a new process is involved: [delete_workqueu].
It is now usually the first process to enter D state, before the others
lock up. I have been looking for information about this process but couldn't
find anything.
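
To see which process enters D state first, something like the following
could be run periodically on each node. It is only a minimal sketch, not
something taken from our servers: it assumes the standard Linux
/proc/<pid>/stat layout (pid, then the command name in parentheses, then the
state letter), and the function names parse_stat and d_state_processes are
made up for illustration.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the LAST ')'.
    lpar = text.index("(")
    rpar = text.rindex(")")
    pid = int(text[:lpar].strip())
    comm = text[lpar + 1:rpar]
    state = text[rpar + 1:].split()[0]
    return pid, comm, state


def d_state_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the file was unreadable
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm in d_state_processes():
        print(pid, comm)
```

Logging this output with a timestamp every few seconds should show whether
[delete_workqueu] really appears in D state before the httpd processes do.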

Any ideas?

Regards :)


2010/4/9 Ricardo Argüello <ricardo at fedoraproject.org>

> Looks like this bug:
>
> GFS2 - probably lost glock call back
> https://bugzilla.redhat.com/show_bug.cgi?id=498976
>
> This is fixed in the kernel included in RHEL 5.5.
> Do a "yum update" to fix it.
>
> Ricardo Arguello
>
> On Tue, Mar 2, 2010 at 6:10 AM, Emilio Arjona <emilio.ah at gmail.com> wrote:
> > Thanks for your response, Steve.
> >
> > 2010/3/2 Steven Whitehouse <swhiteho at redhat.com>:
> >> Hi,
> >>
> >> On Fri, 2010-02-26 at 16:52 +0100, Emilio Arjona wrote:
> >>> Hi,
> >>>
> >>> We are experiencing some problems discussed in an old thread:
> >>>
> >>> http://www.mail-archive.com/linux-cluster@redhat.com/msg07091.html
> >>>
> >>> We have 3 clustered servers under Red Hat 5.4 accessing a GFS2
> >>> resource.
> >>>
> >>> fstab options:
> >>> /dev/vg_cluster/lv_cluster /opt/datacluster gfs2 defaults,noatime,nodiratime,noquota 0 0
> >>>
> >>> GFS options:
> >>> plock_rate_limit="0"
> >>> plock_ownership=1
> >>>
> >>> httpd processes sometimes get stuck in D state, and the only solution
> >>> is to hard reset the affected server.
> >>>
> >>> Can anyone give me some hints to diagnose the problem?
> >>>
> >>> Thanks :)
> >>>
> >> Can you give me a rough idea of what the actual workload is and how it
> >> is distributed among the director(y/ies)?
> >
> > We had problems with PHP sessions in the past, but we fixed them by
> > configuring PHP to store the sessions in the database instead of in
> > the GFS filesystem. Now we're having problems with files and
> > directories in the "data" folder of Moodle LMS.
> >
> > "lsof -p" showed an I/O operation on the same folder on 2 of the 3
> > nodes. We hard reset those nodes, but a few hours later the CPU load
> > rose again, especially on the node that hadn't been rebooted. We then
> > rebooted that node (via ssh), and the CPU load went back to normal
> > on all nodes.
> >
> > I don't think the system's load is high enough to cause concurrent
> > access problems. It's more likely to be a misconfiguration; in
> > fact, we changed some GFS2 options to non-default values to increase
> > performance
> > (http://www.linuxdynasty.org/howto-increase-gfs2-performance-in-a-cluster.html).
> >
> >>
> >> This is often down to contention on glocks (one per inode), and may be
> >> because there is a process or processes writing a file or directory
> >> which is in use (either read-only or writable) by other processes.
> >>
> >> If you are using PHP, then you might have to strace it to find out what
> >> it is really doing.
> >
> > Ok, we will try to strace the D processes and post the results. Hope
> > we find something!!
> >
> >>
> >> Steve.
> >>
> >>> --
> >>>
> >>> Emilio Arjona.
> >>>
> >>> --
> >>> Linux-cluster mailing list
> >>> Linux-cluster at redhat.com
> >>> https://www.redhat.com/mailman/listinfo/linux-cluster
> >>
> >>
> >
> >
> >
> > --
> > Emilio Arjona.
> >
>



-- 
*******************************************
Emilio Arjona Heredia
Centro de Enseñanzas Virtuales de la Universidad de Granada
C/ Real de Cartuja 36-38
http://cevug.ugr.es
Tlfno.: 958-241000 ext. 20206
*******************************************