[Freeipa-users] ipa replica failed PR_DeleteSemaphore

Andrew E. Bruno aebruno2 at buffalo.edu
Wed Mar 9 15:37:05 UTC 2016


On Wed, Mar 09, 2016 at 04:13:28PM +0100, Ludwig Krispenz wrote:
> 
> On 03/09/2016 03:46 PM, Andrew E. Bruno wrote:
> >Hello,
> >
> >We had a replica fail today with:
> >
> >[09/Mar/2016:09:39:59 -0500] NSMMReplicationPlugin - changelog program - _cl5NewDBFile: PR_DeleteSemaphore: /var/lib/dirsrv/slapd-CBLS-CCR-BUFFALO-EDU/cldb/e909b405-2cb811e5-ac0b8f7e-e0b1a377.sema; NSPR error - -5943
> the nspr error means:
> /* Cannot create or rename a filename that already exists */
> #define PR_FILE_EXISTS_ERROR (-5943L)
> 
> could you check if the file exists and if there is a permission problem for
> the dirsrv user to recreate it ?

Looks like the file exists:


# ls -alh /var/lib/dirsrv/slapd-CBLS-CCR-BUFFALO-EDU/cldb/e909b405-2cb811e5-ac0b8f7e-e0b1a377.sema
-rw-r--r-- 1 dirsrv dirsrv 0 Mar  9 09:39 /var/lib/dirsrv/slapd-CBLS-CCR-BUFFALO-EDU/cldb/e909b405-2cb811e5-ac0b8f7e-e0b1a377.sema
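
The same existence and permission check can be scripted so it is easy to repeat on the other replicas. This is only a minimal sketch: the helper name describe_file is made up for illustration, and it reports what stat() returns without deciding whether dirsrv could actually recreate the file.

```python
import os
import pwd
import stat


def describe_file(path):
    """Return existence, owner, mode string and size for a file.

    Hypothetical helper for illustration; not part of 389-ds or FreeIPA.
    """
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return {"exists": False}
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        # No passwd entry for this uid; fall back to the numeric id.
        owner = str(st.st_uid)
    return {
        "exists": True,
        "owner": owner,
        "mode": stat.filemode(st.st_mode),  # e.g. "-rw-r--r--"
        "size": st.st_size,
    }
```

Calling describe_file() on the .sema path above should report the same owner, mode and size that ls shows.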

> if the process hangs, could you get a pstack from the process ?

We ran systemctl restart ipa, which failed, but it looks like dirsrv is still running. The logs are now filling up with:

[09/Mar/2016:10:23:10 -0500] DSRetroclPlugin - delete_changerecord: could not delete change record 11272988 (rc: 32)
[09/Mar/2016:10:23:10 -0500] DSRetroclPlugin - delete_changerecord: could not delete change record 11272989 (rc: 32)
[09/Mar/2016:10:23:10 -0500] DSRetroclPlugin - delete_changerecord: could not delete change record 11272990 (rc: 32)
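
To see how far the retro changelog trimming has progressed, the messages can be summarized per result code. This is a rough sketch: the regular expression is derived only from the three lines above, and the function name is invented for illustration. For reference, rc 32 is the standard LDAP noSuchObject result code.

```python
import re

# Pattern taken from the log lines quoted above; other message variants
# are not handled.
RETROCL_RE = re.compile(
    r"delete_changerecord: could not delete change record (\d+) \(rc: (\d+)\)"
)

SAMPLE = [
    "[09/Mar/2016:10:23:10 -0500] DSRetroclPlugin - delete_changerecord: could not delete change record 11272988 (rc: 32)",
    "[09/Mar/2016:10:23:10 -0500] DSRetroclPlugin - delete_changerecord: could not delete change record 11272989 (rc: 32)",
    "[09/Mar/2016:10:23:10 -0500] DSRetroclPlugin - delete_changerecord: could not delete change record 11272990 (rc: 32)",
]


def summarize_failures(lines):
    """Map each result code to (count, lowest record, highest record)."""
    by_rc = {}
    for line in lines:
        match = RETROCL_RE.search(line)
        if match:
            record, rc = int(match.group(1)), int(match.group(2))
            by_rc.setdefault(rc, []).append(record)
    return {rc: (len(r), min(r), max(r)) for rc, r in by_rc.items()}


print(summarize_failures(SAMPLE))
```

Run against the full errors log, this would show whether the failing record numbers are still advancing or stuck on the same range.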

However, kinit fails with:

kinit: Cannot contact any KDC for realm 'CBLS.CCR.BUFFALO.EDU' while getting initial credentials

Should I be concerned that this will end up corrupting the other replicas? Should we just let this finish?

We have 3 replicas in our system. Looks like we just lost a second one. This
feels very similar to the error we hit a while back:

https://www.redhat.com/archives/freeipa-users/2015-September/msg00006.html

We're seeing the exact same behavior. The access logs are filling up with:

[09/Mar/2016:10:26:03 -0500] conn=6877203 fd=4003 slot=4003 connection from 10.113.14.131 to 10.113.14.131
[09/Mar/2016:10:26:03 -0500] conn=6877204 fd=4004 slot=4004 connection from 10.116.28.10 to 10.113.14.131
[09/Mar/2016:10:26:09 -0500] conn=6877205 fd=4005 slot=4005 connection from 10.113.14.131 to 10.113.14.131
[09/Mar/2016:10:26:15 -0500] conn=6877206 fd=4006 slot=4006 connection from 10.113.14.131 to 10.113.14.131
[09/Mar/2016:10:26:21 -0500] conn=6877207 fd=4007 slot=4007 connection from 10.113.14.131 to 10.113.14.131
[09/Mar/2016:10:26:27 -0500] conn=6877208 fd=4008 slot=4008 connection from 10.113.14.131 to 10.113.14.131
[09/Mar/2016:10:26:28 -0500] conn=6877209 fd=4009 slot=4009 connection from 10.116.28.33 to 10.113.14.131
[09/Mar/2016:10:26:30 -0500] conn=6877210 fd=4010 slot=4010 connection from 10.116.28.23 to 10.113.14.131
[09/Mar/2016:10:26:33 -0500] conn=6877211 fd=4011 slot=4011 connection from 10.113.14.131 to 10.113.14.131
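
In the excerpt above, most connections come from the server's own address and the file descriptor number climbs by one with every new connection. A small sketch like the following can tally this over a whole access log. The regular expression is based only on the lines shown here, and the function name is illustrative.

```python
import re
from collections import Counter

# Matches only the "connection from X to Y" lines quoted above.
CONN_RE = re.compile(
    r"conn=(\d+) fd=(\d+) slot=\d+ connection from (\S+) to (\S+)"
)

SAMPLE = [
    "[09/Mar/2016:10:26:03 -0500] conn=6877203 fd=4003 slot=4003 connection from 10.113.14.131 to 10.113.14.131",
    "[09/Mar/2016:10:26:03 -0500] conn=6877204 fd=4004 slot=4004 connection from 10.116.28.10 to 10.113.14.131",
    "[09/Mar/2016:10:26:09 -0500] conn=6877205 fd=4005 slot=4005 connection from 10.113.14.131 to 10.113.14.131",
    "[09/Mar/2016:10:26:15 -0500] conn=6877206 fd=4006 slot=4006 connection from 10.113.14.131 to 10.113.14.131",
    "[09/Mar/2016:10:26:21 -0500] conn=6877207 fd=4007 slot=4007 connection from 10.113.14.131 to 10.113.14.131",
    "[09/Mar/2016:10:26:27 -0500] conn=6877208 fd=4008 slot=4008 connection from 10.113.14.131 to 10.113.14.131",
    "[09/Mar/2016:10:26:28 -0500] conn=6877209 fd=4009 slot=4009 connection from 10.116.28.33 to 10.113.14.131",
    "[09/Mar/2016:10:26:30 -0500] conn=6877210 fd=4010 slot=4010 connection from 10.116.28.23 to 10.113.14.131",
    "[09/Mar/2016:10:26:33 -0500] conn=6877211 fd=4011 slot=4011 connection from 10.113.14.131 to 10.113.14.131",
]


def summarize_connections(lines):
    """Count connections per source address and report the fd range seen."""
    sources = Counter()
    fds = []
    for line in lines:
        match = CONN_RE.search(line)
        if match:
            fds.append(int(match.group(2)))
            sources[match.group(3)] += 1
    fd_range = (min(fds), max(fds)) if fds else None
    return sources, fd_range


sources, fd_range = summarize_connections(SAMPLE)
print(sources.most_common(), fd_range)
```

A steadily rising fd number with no reuse may mean connections are being accepted but never closed, though that is a guess from this short excerpt and not a diagnosis.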

top shows this for the ns-slapd process:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
24951 dirsrv    20   0 15.477g 0.013t 6.067g S   0.0 27.3 101566:54 ns-slapd


I'd be happy to provide a pstack, but I can't seem to get the correct debuginfo
packages installed. We're running CentOS 7 with 389-ds-base 1.3.3.1 and haven't
upgraded to 1.3.4.0. How can I get the debuginfo packages installed for that
specific version?

Thanks!

--Andrew



