[rhelv6-list] LDAPD dies after update

Prentice Bisbal prentice at ias.edu
Thu Sep 1 14:48:00 UTC 2011


On 09/01/2011 09:40 AM, Götz Reinicke wrote:
> On 01.09.11 15:08, Prentice Bisbal wrote:
>> On 09/01/2011 08:36 AM, Götz Reinicke wrote:
>>> Hi,
>>>
>>> recently I updated our ldapd on our RH EL 6.1 to the most recent version
>>> openldap-2.4.23-15.el6_1.1.x86_64 (from 2.4.19-15)
>>>
>>> Since then the daemon has died twice in the middle of the night, leaving
>>> no trace of why.
>>>
>>> The 2.4.19-15-version never died ...
>>>
>>
>> I can't offer any advice as to why it died, but if you haven't done so,
>> I recommend creating a 'watchdog' script that checks to make sure
>> 'slapd' is running, and if it isn't, restart it. Run it from cron every
>> couple of minutes and have it e-mail you every time it needs to restart
>> slapd.
>>
>> This will protect your sanity and avoid phone calls in the middle of the
>> night, as well as automatically collect statistics on how often and when
>> it's dying, which may help you correlate it with another event occurring
>> at the same time that is the root cause. If your watchdog script runs
>> frequently enough, users might not even notice it's down.
> 
> Thanks for your suggestion. Maybe you could give me a hint on how to
> set this up? In the example config, there is a check for an existing
> .pid-file.
> 
> That would not work in my case, as the pid file is still there, but the
> slapd-process died.
> 


Use pgrep to look for a running slapd process, and then check the exit
value returned by pgrep. Something like this. (Do not copy exactly: not
tested, so it may well have errors. No warranties implied, etc.)

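#!/bin/bash

# Exit status 0 means at least one process named exactly "slapd" is running.
pgrep -x slapd > /dev/null
retval=$?

if [ "$retval" -ne 0 ]; then
	# remove the stale PID file
	rm -f /path/to/pid
	# restart slapd (the init script is "slapd" on RHEL 6; it was "ldap" on RHEL 5)
	service slapd start
	# send a notification; replace the address with your own
	echo "LDAP server restarted at $(date)" | mail -s "LDAP restarted" root@yourdomain.com
fi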
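
If you would rather keep a PID file check, the file can be validated
instead of trusted: read the PID from it and ask the kernel whether that
process still exists. A minimal sketch, assuming the watchdog runs as root
(kill -0 also reports failure for a live process you are not allowed to
signal). The function name pid_file_alive is made up for illustration, and
/path/to/pid is the same placeholder as above.

```shell
#!/bin/bash

# Hypothetical helper: succeed only if the PID recorded in the file "$1"
# belongs to a process that currently exists.
pid_file_alive() {
    local pidfile=$1 pid=""

    # A missing or unreadable PID file counts as "not alive".
    [ -r "$pidfile" ] || return 1

    # Read the first line; the status of read is ignored on purpose,
    # because it is non-zero when the file lacks a trailing newline.
    read -r pid < "$pidfile"

    # Accept only a positive integer (rejects empty files, junk, and 0).
    [[ $pid =~ ^[1-9][0-9]*$ ]] || return 1

    # Signal 0 sends nothing; it only tests whether the process exists.
    kill -0 "$pid" 2> /dev/null
}

# Example use (commented out so that sourcing this file has no side effects):
#   pid_file_alive /path/to/pid || echo "PID file is stale"
```

Keep in mind that PIDs get reused, so a live PID does not prove the process
is slapd. That is one reason to prefer the pgrep check, or to use both.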

The exit statuses of pgrep are documented in the man page:


 EXIT STATUS
       0      One or more processes matched the criteria.

       1      No processes matched.

       2      Syntax error in the command line.

       3      Fatal error: out of memory etc.
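
Given those statuses, it may be worth restarting only on status 1, and
treating 2 or 3 as a problem with the check itself rather than proof that
slapd is down. A rough sketch of that idea; the function name pgrep_action
and the action words it prints are invented for this example.

```shell
#!/bin/bash

# Hypothetical helper: translate a pgrep exit status ("$1") into the action
# a watchdog might take. The action names are illustrative only.
pgrep_action() {
    case "$1" in
        0) echo "none" ;;      # one or more processes matched: slapd is up
        1) echo "restart" ;;   # no processes matched: slapd is down
        *) echo "alert" ;;     # 2 or 3: pgrep itself failed, do not restart blindly
    esac
}

# Example use inside the watchdog (commented out so that sourcing this file
# has no side effects):
#   pgrep -x slapd > /dev/null
#   action=$(pgrep_action $?)
```

To run the finished watchdog every couple of minutes, a line like this in
root's crontab should do (the script path is only an example):

    */2 * * * * /usr/local/sbin/slapd-watchdog.sh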


--
Prentice





