[389-users] 389 unusable on F11?
Kevin Bowling
kevinb at analograils.com
Fri Sep 11 17:14:01 UTC 2009
On 9/11/2009 12:43 PM, Noriko Hosoi wrote:
> On 09/10/2009 07:46 PM, Kevin Bowling wrote:
>> Hi,
>>
>> I have been running FDS/389 on a F11 xen DomU for several months. I
>> use it as the backend for UNIX username/passwords and also for
>> redMine (a Ruby on Rails bug tracker) for http://www.gnucapplus.org/.
>>
>> This VM would regularly lock up every week or so when 389 was still
>> called FDS. I've since upgraded to 389 by issuing 'yum upgrade' as
>> well as running the 'setup-...-.pl -u' script and now it barely goes
>> a day before crashing. When ldap crashes, the whole box basically
>> becomes unresponsive.
>>
>> I left the Xen hardware console open to see what was up and the only
>> thing I could conclude was that 389 was crashing (if I issued a
>> service start it came back to life). Doing anything like a top or ls
>> will completely kill the box. Likewise, the logs show nothing at or
>> before the time of crash. I suspected too few file descriptors but
>> changing that to a very high number had no impact.
>>
>> I was about to do a rip and replace with OpenLDAP which I use very
>> successfully for our corporate systems, but figured I ought to see if
>> anyone here can help or if I can submit any kind of meaningful bug
>> report first. I assume I will need to run 389's slapd without
>> daemonizing it and hope it spits something useful out to stderr. Any
>> advice here would be greatly appreciated, as would any success
>> stories of using 389 on F11.
> Hello Kevin,
>
> You specified the platform "F11 xen DomU". Did you have a chance to
> run the 389 server on any other platforms? I'm wondering if the crash
> is observed only on the specific platform or not. Is the server
> running on the 64-bit machine or 32-bit?
>
> If you start the server with the "-d 1" option, the server will run in
> trace mode. (E.g., /usr/lib[64]/dirsrv/slapd-YOURID/start-slapd -d 1)
>
> I'm afraid it might be a memory leak. After you restart the 389
> server, could you check the size of the ns-slapd process periodically,
> say every hour, and see whether it keeps growing or levels off? Also,
> the server quits if it fails to write to the errors log; if that
> happens, it is logged in the system log. Does the messages file on the
> system happen to have any entries related to the 389 server?
>
> Thanks,
> --noriko
>>
>> I'm not subscribed to the list so please CC.
>>
>> Regards,
>>
>> Kevin Bowling
>>
>> --
>> 389 users mailing list
>> 389-users at redhat.com
>> https://www.redhat.com/mailman/listinfo/fedora-directory-users
>
I captured some output while running in trace mode; see the end of this
message. The system is 64-bit, and I have not run the server on any other
boxes. A cursory look with top showed only about 10 MB of RSS.
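In case it helps anyone else, here is a minimal sketch of the hourly size
check suggested above. It assumes a single running ns-slapd and that
`pgrep` is available; the log path is just an example:

```shell
# Sample ns-slapd memory usage once an hour until the process exits.
# /tmp/ns-slapd-mem.log is an example path; adjust to taste.
PID=$(pgrep -x ns-slapd)
while kill -0 "$PID" 2>/dev/null; do
    ps -o rss=,vsz= -p "$PID" |
        awk -v d="$(date '+%F %T')" '{print d, "RSS(kB)=" $1, "VSZ(kB)=" $2}' \
        >> /tmp/ns-slapd-mem.log
    sleep 3600
done
```

If RSS climbs steadily between samples, that would point at a leak; if it
plateaus, the crash is probably something else.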
Regards,
Kevin
[11/Sep/2009:09:58:44 -0700] - => id2entry( 48 )
[11/Sep/2009:09:58:44 -0700] - <= id2entry 7f025401f5a0 (cache)
[11/Sep/2009:09:58:44 -0700] - => id2entry( 50 )
[11/Sep/2009:09:58:44 -0700] - <= id2entry 7f0254021190 (cache)
[11/Sep/2009:09:58:44 -0700] - => slapi_reslimit_get_integer_limit()
conn=0xa856beb0, handle=3
[11/Sep/2009:09:58:44 -0700] - <= slapi_reslimit_get_integer_limit()
returning NO VALUE
[11/Sep/2009:09:58:44 -0700] - => slapi_reslimit_get_integer_limit()
conn=0xa856bc60, handle=3
[11/Sep/2009:09:58:44 -0700] - <= slapi_reslimit_get_integer_limit()
returning NO VALUE
[11/Sep/2009:09:58:44 -0700] - => slapi_reslimit_get_integer_limit()
conn=0xa856bd88, handle=3
[11/Sep/2009:09:58:44 -0700] - <= slapi_reslimit_get_integer_limit()
returning NO VALUE
[11/Sep/2009:09:58:44 -0700] - => slapi_reslimit_get_integer_limit()
conn=0xa856bb38, handle=3
[11/Sep/2009:09:58:44 -0700] - <= slapi_reslimit_get_integer_limit()
returning NO VALUE
[11/Sep/2009:09:58:44 -0700] - => slapi_reslimit_get_integer_limit()
conn=0xa856ba10, handle=3
[11/Sep/2009:09:58:44 -0700] - <= slapi_reslimit_get_integer_limit()
returning NO VALUE
[11/Sep/2009:09:58:44 -0700] - => send_ldap_result 0::
[11/Sep/2009:09:58:44 -0700] - <= send_ldap_result
[11/Sep/2009:09:58:50 -0700] - ldbm backend flushing
[11/Sep/2009:09:58:50 -0700] - ldbm backend done flushing
[11/Sep/2009:09:58:50 -0700] - ldbm backend flushing
[11/Sep/2009:09:58:50 -0700] - ldbm backend done flushing
[11/Sep/2009:09:59:20 -0700] - ldbm backend flushing
[11/Sep/2009:09:59:20 -0700] - ldbm backend done flushing
[11/Sep/2009:09:59:20 -0700] - ldbm backend flushing
[11/Sep/2009:09:59:20 -0700] - ldbm backend done flushing
[11/Sep/2009:09:59:50 -0700] - ldbm backend flushing
[11/Sep/2009:09:59:50 -0700] - ldbm backend done flushing
[11/Sep/2009:09:59:50 -0700] - ldbm backend flushing
[11/Sep/2009:09:59:50 -0700] - ldbm backend done flushing
[11/Sep/2009:10:00:20 -0700] - ldbm backend flushing
[11/Sep/2009:10:00:20 -0700] - ldbm backend done flushing
[11/Sep/2009:10:00:20 -0700] - ldbm backend flushing
[11/Sep/2009:10:00:20 -0700] - ldbm backend done flushing
[11/Sep/2009:10:00:50 -0700] - ldbm backend flushing
[11/Sep/2009:10:01:03 -0700] - ldbm backend done flushing
[11/Sep/2009:10:01:03 -0700] - ldbm backend flushing
[11/Sep/2009:10:01:04 -0700] - ldbm backend done flushing
[11/Sep/2009:10:01:35 -0700] - ldbm backend flushing
[11/Sep/2009:10:01:39 -0700] - ldbm backend done flushing
[11/Sep/2009:10:01:39 -0700] - ldbm backend flushing
[11/Sep/2009:10:01:39 -0700] - ldbm backend done flushing
[11/Sep/2009:10:01:39 -0700] - ldbm backend flushing
[11/Sep/2009:10:01:39 -0700] - ldbm backend done flushing
[11/Sep/2009:10:01:39 -0700] - ldbm backend flushing
[11/Sep/2009:10:01:39 -0700] - ldbm backend done flushing
[11/Sep/2009:10:01:39 -0700] - slapd shutting down - signaling operation
threads
[11/Sep/2009:10:01:40 -0700] - slapd shutting down - waiting for 30
threads to terminate
[11/Sep/2009:10:01:40 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:40 -0700] - slapd shutting down - waiting for 29
threads to terminate
[11/Sep/2009:10:01:40 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:40 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:40 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:40 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:40 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:40 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:40 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:40 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:40 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:40 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - op_thread received shutdown signal
[11/Sep/2009:10:01:41 -0700] - slapd shutting down - waiting for 28
threads to terminate
[11/Sep/2009:10:01:41 -0700] - slapd shutting down - closing down
internal subsystems and plugins
[11/Sep/2009:10:01:41 -0700] - slapd shutting down - waiting for
backends to close down
[11/Sep/2009:10:01:42 -0700] - => slapi_control_present (looking for
1.3.6.1.4.1.42.2.27.8.5.1)
[11/Sep/2009:10:01:42 -0700] - <= slapi_control_present 0 (NO CONTROLS)
[11/Sep/2009:10:01:42 -0700] - modify_update_last_modified_attr
[11/Sep/2009:10:01:42 -0700] - Calling plugin 'Distributed Numeric
Assignment internal preop plugin' #0 type 421
[11/Sep/2009:10:01:42 -0700] dna-plugin - --> dna_pre_op
[11/Sep/2009:10:01:42 -0700] dna-plugin - <-- dna_pre_op
[11/Sep/2009:10:01:42 -0700] - Calling plugin 'Legacy replication
internal preoperation plugin' #1 type 421
[11/Sep/2009:10:01:42 -0700] - Calling plugin 'Multimaster replication
internal preoperation plugin' #2 type 421
[11/Sep/2009:10:01:42 -0700] - => entry_apply_mods
[11/Sep/2009:10:01:42 -0700] - <= entry_apply_mods 0
[11/Sep/2009:10:01:42 -0700] - => send_ldap_result 0::
[11/Sep/2009:10:01:42 -0700] - <= send_ldap_result
[11/Sep/2009:10:01:42 -0700] - ps_service_persistent_searches: entry
"cn=uniqueid generator,cn=config" not enqueued on any persistent search
lists
[11/Sep/2009:10:01:42 -0700] - Calling plugin 'Class of Service
internalpostoperation plugin' #0 type 521
[11/Sep/2009:10:01:42 -0700] - --> cos_post_op
[11/Sep/2009:10:01:42 -0700] - --> cos_cache_change_notify
[11/Sep/2009:10:01:42 -0700] - --> cos_cache_template_index_bsearch
[11/Sep/2009:10:01:42 -0700] - --> cos_cache_getref
[11/Sep/2009:10:01:42 -0700] - <-- cos_cache_getref
[11/Sep/2009:10:01:42 -0700] - <-- cos_cache_template_index_bsearch
[11/Sep/2009:10:01:42 -0700] - <-- cos_cache_change_notify
[11/Sep/2009:10:01:42 -0700] - <-- cos_post_op
[11/Sep/2009:10:01:42 -0700] - Calling plugin 'Legacy replication
internal postoperation plugin' #1 type 521
[11/Sep/2009:10:01:43 -0700] - Calling plugin 'Multimaster replication
internal postoperation plugin' #2 type 521
[11/Sep/2009:10:01:43 -0700] - Calling plugin 'Retrocl internal
postoperation plugin' #3 type 521
not applying change if not logging
[11/Sep/2009:10:01:43 -0700] - Calling plugin 'Roles
internalpostoperation plugin' #4 type 521
[11/Sep/2009:10:01:43 -0700] - Calling plugin 'Legacy Replication
Plugin' #0 type 210
[11/Sep/2009:10:01:43 -0700] - Calling plugin 'Roles Plugin' #0 type 210
[11/Sep/2009:10:01:43 -0700] - Calling plugin 'Multimaster Replication
Plugin' #0 type 210
[11/Sep/2009:10:01:43 -0700] - Calling plugin 'HTTP Client' #0 type 210
[11/Sep/2009:10:01:43 -0700] - Calling plugin 'Class of Service' #0 type 210
[11/Sep/2009:10:01:43 -0700] - --> cos_close
[11/Sep/2009:10:01:43 -0700] - --> cos_cache_stop
[11/Sep/2009:10:01:43 -0700] - <-- cos_cache_wait_on_change thread exit
[11/Sep/2009:10:01:43 -0700] - --> cos_cache_release
[11/Sep/2009:10:01:43 -0700] - <-- cos_cache_release
[11/Sep/2009:10:01:43 -0700] - <-- cos_cache_stop
[11/Sep/2009:10:01:43 -0700] - <-- cos_close
[11/Sep/2009:10:01:43 -0700] - Calling plugin 'ACL Plugin' #0 type 210
[11/Sep/2009:10:01:43 -0700] - Calling plugin 'Views' #0 type 210
[11/Sep/2009:10:01:43 -0700] views-plugin - --> views_close
[11/Sep/2009:10:01:43 -0700] views-plugin - --> views_cache_free
[11/Sep/2009:10:01:43 -0700] views-plugin - <-- views_cache_free
[11/Sep/2009:10:01:43 -0700] views-plugin - <-- views_close
[11/Sep/2009:10:01:43 -0700] - Calling plugin 'State Change Plugin' #0
type 210
[11/Sep/2009:10:01:43 -0700] statechange-plugin - --> statechange_close
[11/Sep/2009:10:01:43 -0700] statechange-plugin - <-- statechange_close
[11/Sep/2009:10:01:43 -0700] - Calling plugin 'ldbm database' #0 type 210
[11/Sep/2009:10:01:43 -0700] - ldbm backend syncing
[11/Sep/2009:10:01:43 -0700] - Waiting for 4 database threads to stop
[11/Sep/2009:10:01:43 -0700] - Leaving deadlock_threadmain
[11/Sep/2009:10:01:44 -0700] - Leaving checkpoint_threadmain before
checkpoint
[11/Sep/2009:10:01:44 -0700] - Checkpointing database ...
[11/Sep/2009:10:01:44 -0700] - Leaving checkpoint_threadmain
[11/Sep/2009:10:01:44 -0700] - Leaving trickle_threadmain priv
[11/Sep/2009:10:01:44 -0700] - Leaving perf_threadmain
[11/Sep/2009:10:01:45 -0700] - All database threads now stopped
[11/Sep/2009:10:01:45 -0700] - ldbm backend done syncing
[11/Sep/2009:10:01:45 -0700] - Calling plugin 'chaining database' #0
type 210
[11/Sep/2009:10:01:45 -0700] - Removed [1] entries from the dse tree.
[11/Sep/2009:10:01:45 -0700] - Removed [166] entries from the dse tree.
[11/Sep/2009:10:01:45 -0700] - ldbm backend cleaning up
[11/Sep/2009:10:01:45 -0700] - ldbm backend cleaning up
[11/Sep/2009:10:01:45 -0700] - slapd shutting down - backends closed down
[11/Sep/2009:10:01:45 -0700] - => reslimit_update_from_entry()
conn=0xa856ba10, entry=0x0
[11/Sep/2009:10:01:45 -0700] - reslimit_update_from_entry(): setting
limit for handle 0 (based on nsLookThroughLimit)
[11/Sep/2009:10:01:45 -0700] - reslimit_update_from_entry(): setting
limit for handle 1 (based on nsSizeLimit)
[11/Sep/2009:10:01:45 -0700] - reslimit_update_from_entry(): setting
limit for handle 2 (based on nsTimeLimit)
[11/Sep/2009:10:01:45 -0700] - reslimit_update_from_entry(): setting
limit for handle 3 (based on nsIdleTimeout)
[11/Sep/2009:10:01:45 -0700] - <= reslimit_update_from_entry() returning
status 0
[11/Sep/2009:10:01:45 -0700] - => reslimit_update_from_entry()
conn=0xa856bb38, entry=0x0
[11/Sep/2009:10:01:45 -0700] - reslimit_update_from_entry(): setting
limit for handle 0 (based on nsLookThroughLimit)
[11/Sep/2009:10:01:45 -0700] - reslimit_update_from_entry(): setting
limit for handle 1 (based on nsSizeLimit)
[11/Sep/2009:10:01:45 -0700] - reslimit_update_from_entry(): setting
limit for handle 2 (based on nsTimeLimit)
[11/Sep/2009:10:01:45 -0700] - reslimit_update_from_entry(): setting
limit for handle 3 (based on nsIdleTimeout)
[11/Sep/2009:10:01:45 -0700] - <= reslimit_update_from_entry() returning
status 0
[11/Sep/2009:10:01:45 -0700] - => reslimit_update_from_entry()
conn=0xa856bd88, entry=0x0
[11/Sep/2009:10:01:45 -0700] - reslimit_update_from_entry(): setting
limit for handle 0 (based on nsLookThroughLimit)
[11/Sep/2009:10:01:45 -0700] - reslimit_update_from_entry(): setting
limit for handle 1 (based on nsSizeLimit)
[11/Sep/2009:10:01:45 -0700] - reslimit_update_from_entry(): setting
limit for handle 2 (based on nsTimeLimit)
[11/Sep/2009:10:01:45 -0700] - reslimit_update_from_entry(): setting
limit for handle 3 (based on nsIdleTimeout)
[11/Sep/2009:10:01:45 -0700] - <= reslimit_update_from_entry() returning
status 0
[11/Sep/2009:10:01:45 -0700] - => reslimit_update_from_entry()
conn=0xa856beb0, entry=0x0
[11/Sep/2009:10:01:45 -0700] - reslimit_update_from_entry(): setting
limit for handle 0 (based on nsLookThroughLimit)
[11/Sep/2009:10:01:45 -0700] - reslimit_update_from_entry(): setting
limit for handle 1 (based on nsSizeLimit)
[11/Sep/2009:10:01:45 -0700] - reslimit_update_from_entry(): setting
limit for handle 2 (based on nsTimeLimit)
[11/Sep/2009:10:01:45 -0700] - reslimit_update_from_entry(): setting
limit for handle 3 (based on nsIdleTimeout)
[11/Sep/2009:10:01:45 -0700] - <= reslimit_update_from_entry() returning
status 0
[11/Sep/2009:10:01:45 -0700] - slapd stopped.