[Linux-cluster] GFS load average and locking

Wendy Cheng wcheng at redhat.com
Thu Mar 9 20:32:56 UTC 2006


Marc Grimme wrote:

>Although the strace does not show the output I know, the problem description 
>sounds like deja vu.
>We had loads of problems with sessions on GFS, with httpds ending up in 
>"D" state for some time (at high-load times we had ServerLimit httpds in D 
>state per node, which left the service unavailable). 
>As I posted already, we think it is caused by the "bad" locking of sessions 
>by php (the php sessions are on gfs, and strace showed those timeouts on 
>the session files). When you issue a "session_start", or whatever that 
>function is called, the session file is locked via an flock syscall. That 
>lock is held until you end the session, which is implicitly done when the tcp 
>connection to the client is ended. Now another httpd process (on 
>whatever node) calls a "session start" and tries an flock on that session 
>file while another process already holds the lock. The process might end up 
>in the timeouts we saw (30-60 secs), which, as far as I remember, relate to 
>the tcp connection timeout defined in httpd.conf or some timeout in 
>php.ini. There is an explanation for this but I cannot remember it ;-). 
>Nevertheless, in our scenario the problem was the "bad" session handling by 
>php. We have made a patch for the phplib where you can disable the locking, 
>or just lock implicitly and therefore keep consistency while session 
>data is read or written. We could make apache work as expected, and we 
>have not seen any "D" process for a year now.
>Oh yes, the patch can be found at
>www.opensharedroot.org in the download section.
>
>Besides: you will never encounter this on a local filesystem or nfs (nfs 
>does not support flocks and silently ignores them).
>
>  
>
Hi,

This does look like the problem description sent out by savvis.net folks 
during our off-list email exchanges. However, without actually looking 
at the thread traces (taken while the processes are in D state), it is 
difficult to be sure. One way to obtain the exact thread trace is to use 
the "crash" tool to do a back trace (e.g. "bt <pid>"; you need the kernel 
debuginfo RPM though). Britt, do let us know whether this php patch helps, 
and/or use the crash command to obtain the thread trace output.
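
To know which PIDs to hand to "bt", it helps to first list the processes 
that are currently in D state. A minimal sketch in Python, assuming a Linux 
/proc layout; the helper names parse_stat and d_state_processes are made up 
for this illustration and are not part of crash or any GFS tool:

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the text of one /proc/<pid>/stat file."""
    # comm is wrapped in parentheses and may itself contain spaces or ")",
    # so split on the last ")" instead of splitting on whitespace.
    left, right = stat_line.rsplit(")", 1)
    pid_text, comm = left.split("(", 1)
    state = right.split()[0]
    return int(pid_text), comm, state


def d_state_processes(proc_root="/proc"):
    """List (pid, comm) for every process whose state field is 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat(handle.read())
        except OSError:
            continue  # the process exited while we were scanning
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in d_state_processes():
        print(pid, comm)
```

Each PID printed this way is a candidate for "bt <pid>" inside crash. The 
list is only a snapshot, so a process may have left D state by the time the 
back trace is taken.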

On the other hand, I don't understand how a local (non-cluster) 
filesystem can be immune to this problem?
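
For what it is worth, flock contention itself can be seen on a local 
filesystem as well. A small sketch in Python, assuming Linux flock semantics 
and a temporary file standing in for a php session file (the "sess_" prefix 
and the function names are only illustrative). It shows only that a second 
exclusive flock is refused while the first one is held; it says nothing 
about how long a blocked process would wait on GFS:

```python
import fcntl
import os
import tempfile


def second_flock_is_refused(path):
    """Return True if a second exclusive flock on path is refused
    while the first one is still held."""
    holder = open(path, "w")
    fcntl.flock(holder, fcntl.LOCK_EX)  # first request takes the lock
    contender = open(path, "r")         # separate open file description
    try:
        # LOCK_NB makes the call fail at once instead of blocking,
        # which is where a real second request would sit and wait.
        fcntl.flock(contender, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return False
    except OSError:
        return True
    finally:
        contender.close()
        holder.close()  # closing the file releases the lock


def demo():
    """Run the check against a throwaway file and clean up afterwards."""
    fd, path = tempfile.mkstemp(prefix="sess_")
    os.close(fd)
    try:
        return second_flock_is_refused(path)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print("second flock refused:", demo())
```

Without LOCK_NB the second call would simply block until the holder lets 
go, so the length of the wait depends on how long the first request keeps 
the session file locked.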

-- Wendy



