[Linux-cluster] service stuck in "starting" state

Rick Stevens ricks at nerd.com
Fri Jul 10 23:50:12 UTC 2009


jason at monsterjam.org wrote:
> Hey cluster gurus,
> I have a 2-node cluster that's been running without issue for quite a while. All of a sudden, one of the nodes will not
> completely start the Apache webserver service. It looks like this:
> 
> [root@tf1 ~]# clustat
> Member Status: Quorate
> 
>   Member Name                              Status
>   ------ ----                              ------
>   tf1                                      Online, Local, rgmanager
>   tf2                                      Online, rgmanager
> 
>   Service Name         Owner (Last)                   State         
>   ------- ----         ----- ------                   -----         
>   Apache Service       tf1                            starting        
>   postfix service      tf1                            started         
> [root@tf1 ~]#
> 
> and I see that httpd is NOT started, although if I run
> /etc/init.d/httpd start
> the service starts without issue.
> 
> Grepping for apache and http in the logs, I see this:
> 
> Jul 10 14:32:13 tf1 httpd: httpd shutdown failed
> Jul 10 14:32:52 tf1 httpd: httpd shutdown failed
> Jul 10 14:33:11 tf1 httpd: httpd shutdown failed
> Jul 10 14:33:57 tf1 httpd: Syntax error on line 117 of /etc/httpd/conf.d/ssl.conf:
> Jul 10 14:33:57 tf1 httpd: SSLCertificateFile: file '/etc/httpd/conf/ssl.crt/server.crt' does not exist or is empty
> Jul 10 14:33:57 tf1 httpd: httpd startup failed
> Jul 10 14:34:06 tf1 httpd: Syntax error on line 117 of /etc/httpd/conf.d/ssl.conf:
> Jul 10 14:34:06 tf1 httpd: SSLCertificateFile: file '/etc/httpd/conf/ssl.crt/server.crt' does not exist or is empty
> Jul 10 14:34:06 tf1 httpd: httpd startup failed
> Jul 10 14:34:08 tf1 httpd: httpd shutdown failed
> Jul 10 16:23:33 tf1 clurgmgrd: [6168]: <info> Executing /etc/init.d/httpd stop 
> Jul 10 16:23:34 tf1 httpd: httpd shutdown failed
> Jul 10 16:24:31 tf1 httpd: httpd shutdown failed
> Jul 10 16:24:36 tf1 httpd: httpd shutdown failed
> Jul 10 16:24:41 tf1 httpd: httpd startup succeeded
> Jul 10 18:10:13 tf1 clurgmgrd: [6231]: <info> Executing /etc/init.d/httpd stop 
> Jul 10 18:10:13 tf1 httpd: httpd shutdown failed
> Jul 10 18:22:00 tf1 httpd: httpd startup succeeded
> [root@tf1 log]# grep apache  messages
> Jul 10 04:40:00 tf1 clurgmgrd[6267]: <notice> stop on script "cluster_apache" returned 1 (generic error) 
> Jul 10 10:04:33 tf1 clurgmgrd[6149]: <notice> stop on script "cluster_apache" returned 1 (generic error) 
> Jul 10 14:29:54 tf1 clurgmgrd[6281]: <notice> stop on script "cluster_apache" returned 1 (generic error) 
> Jul 10 16:23:34 tf1 clurgmgrd[6168]: <notice> stop on script "cluster_apache" returned 1 (generic error) 
> Jul 10 18:10:13 tf1 clurgmgrd[6231]: <notice> stop on script "cluster_apache" returned 1 (generic error) 
> [root@tf1 log]#
> 
> 
> I'm guessing it's the  stop on script "cluster_apache" returned 1 (generic error),
> but I looked at /etc/init.d/httpd on tf1 and tf2 and they are both the same size:
> 
> [root@tf2 ~]# ls -al /etc/init.d/httpd
> -rwxr-xr-x  1 root root 3201 Jan 30  2007 /etc/init.d/httpd
> 
> [root@tf1 log]# ls -al /etc/init.d/httpd
> -rwxr-xr-x  1 root root 3201 Jan 30  2007 /etc/init.d/httpd
> 
> And the Apache service starts/stops just fine on tf2 when the services get failed over to that machine.
> 
> Any ideas on what could be wrong?

tf1 is complaining about a bad SSL cert: the log says that
/etc/httpd/conf/ssl.crt/server.crt does not exist or is empty.  The fact
that it complains when started by clurgmgrd but not when started manually
suggests that clurgmgrd is starting it differently (specifying a
different httpd.conf file, perhaps?).
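
One rough way to check the cert file by hand is sketched below.  The path
to test comes from the log quoted above; the helper name check_cert_file
is made up for illustration, and a POSIX shell is assumed.

```shell
#!/bin/sh
# Hypothetical helper: mirrors the "does not exist or is empty" condition
# that httpd reported for SSLCertificateFile.
check_cert_file() {
    cert="$1"
    # -s is true only if the file exists and has a size greater than zero
    if [ -s "$cert" ]; then
        echo "OK: $cert exists and is not empty"
        return 0
    fi
    echo "PROBLEM: $cert does not exist or is empty" >&2
    return 1
}
```

Running "check_cert_file /etc/httpd/conf/ssl.crt/server.crt" on both tf1
and tf2 should show whether the file differs between the nodes.  After
that, "httpd -t" asks Apache to syntax-check its default configuration,
and "httpd -t -f <file>" does the same for another config file, if the
cluster turns out to be using one.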
----------------------------------------------------------------------
- Rick Stevens, Systems Engineer                      ricks at nerd.com -
- AIM/Skype: therps2        ICQ: 22643734            Yahoo: origrps2 -
-                                                                    -
-     The trouble with troubleshooting is that trouble sometimes     -
-                             shoots back.                           -
----------------------------------------------------------------------



