[Freeipa-users] IPA Replica Issues (Total update abortedLDAP error: Can't contact LDAP server)

Rich Megginson rmeggins at redhat.com
Sat Apr 5 01:16:28 UTC 2014


On 04/03/2014 10:25 PM, Nevada Sanchez wrote:
> I followed the instructions that would give me a core dump, and for 
> some reason, I don't see one in /var/log/dirsrv/slapd-EXAMPLE-COM/, 
> even though the "Disorderly Shutdown" message still shows up in the 
> logs.

Hmm - check again - it should produce a core file.  You can also check whether a segfault was logged:

grep -i segfault /var/log/messages
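
If the core file still doesn't turn up, it may help to line up the two signals: segfault lines in syslog and "Detected Disorderly Shutdown" entries in the dirsrv errors log. Here is a rough Python sketch, not a tested tool. It assumes the errors log timestamp format seen elsewhere in this thread, and that the server process shows up as ns-slapd in /var/log/messages; the function names are made up for illustration.

```python
import re
from datetime import datetime

# Timestamp prefix of a dirsrv errors log line,
# e.g. "[02/Apr/2014:20:49:54 +0000] - ..." (format taken from this thread).
TS_RE = re.compile(r"^\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")


def disorderly_shutdowns(lines):
    """Return the timestamp of every 'Detected Disorderly Shutdown' entry."""
    hits = []
    for line in lines:
        if "Detected Disorderly Shutdown" not in line:
            continue
        match = TS_RE.match(line)
        if match:
            hits.append(datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z"))
    return hits


def segfault_lines(lines, process="ns-slapd"):
    """Rough equivalent of `grep -i segfault`, limited to one process name.

    The default process name is an assumption, not something from the logs above.
    """
    return [line.rstrip("\n") for line in lines
            if "segfault" in line.lower() and process in line]
```

Feed the first function the lines of /var/log/dirsrv/slapd-EXAMPLE-COM/errors and the second the lines of /var/log/messages. A shutdown entry with no segfault line shortly before it might mean the server went down some other way than a segfault.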

> I know that when I explicitly request those attributes, I get "-1 
> Total update abortedLDAP error: Can't contact LDAP server" for 
> nsds5ReplicaLastInitStatus (see below). Access logs stop completely on 
> the replica after the time that you mentioned.

Hmm - looks like a bug.  Please open a ticket.
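
For the ticket, it would help to include the full status value. ldapsearch folds long LDIF values onto continuation lines that start with a single space, which is why the status in the output quoted below is split across lines. Here is a minimal Python sketch that unfolds the output and pulls the status apart. It assumes the "<code> <message>" shape seen in this thread; other versions may format the value differently, in which case int() will raise. The helper names are hypothetical, not part of IPA or 389.

```python
def unfold_ldif(text):
    """Join LDIF continuation lines (lines starting with one space) to the previous line."""
    lines = []
    for line in text.splitlines():
        if line.startswith(" ") and lines:
            lines[-1] += line[1:]  # drop the single leading space, then append
        else:
            lines.append(line)
    return lines


def parse_init_status(text, attr="nsds5ReplicaLastInitStatus"):
    """Return (code, message) for the status attribute, or None if it is absent."""
    prefix = attr.lower() + ":"
    for line in unfold_ldif(text):
        if line.lower().startswith(prefix):
            value = line[len(prefix):].strip()
            code, _, message = value.partition(" ")
            return int(code), message
    return None


def init_succeeded(status):
    """True only if the message contains 'Total update succeeded'."""
    return status is not None and "Total update succeeded" in status[1]
```

Run against the ldapsearch output below, parse_init_status should give back the code -1 and the message with "LDAP server" joined back together, and init_succeeded should return False for it.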

>
> ======================================================
> [root@ipa2 ipaserver]# ldapsearch ldaps://ipa.example.com:636 
> -D 'cn=Directory Manager' -w ##### -b 
> 'cn=meToipa2.example.com,cn=replica,cn=dc\=example\,dc\=com,cn=mapping 
> tree,cn=config' '(objectClass=*)' -s base nsds5ReplicaLastInitStart 
> nsds5replicaUpdateInProgress nsds5ReplicaLastInitStatus cn 
> nsds5BeginReplicaRefresh nsds5ReplicaLastInitEnd
> # extended LDIF
> #
> # LDAPv3
> # base <cn=meToipa2.example.com,cn=replica,cn=dc\=example\,dc\=com,cn=mapping 
> tree,cn=config> with scope baseObject
> # filter: (objectclass=*)
> # requesting: ldaps://ipa.example.com:636 (objectClass=*) 
> nsds5ReplicaLastInitStart nsds5replicaUpdateInProgress 
> nsds5ReplicaLastInitStatus cn nsds5BeginReplicaRefresh 
> nsds5ReplicaLastInitEnd
> #
>
> # meToipa2.example.com, replica, dc\3Dexample\2Cdc\3Dcom, mapping tree, config
> dn: cn=meToipa2.example.com,cn=replica,cn=dc\3Dexample\2Cd
>  c\3Dcom,cn=mapping tree,cn=config
> nsds5ReplicaLastInitStart: 20140401092800Z
> nsds5replicaUpdateInProgress: FALSE
> nsds5ReplicaLastInitStatus: -1 Total update abortedLDAP error: Can't contact L
>  DAP server
> cn: meToipa2.example.com
> nsds5ReplicaLastInitEnd: 20140401092804Z
>
> # search result
> search: 2
> result: 0 Success
>
> # numResponses: 2
> # numEntries: 1
>
>
> On Thu, Apr 3, 2014 at 6:32 PM, Rich Megginson <rmeggins at redhat.com> wrote:
>
>     On 04/03/2014 03:46 PM, Nevada Sanchez wrote:
>>     Okay, I updated the gist and extended some of the logs
>>     (ipa2-errors does stop at 20:50:21). I'll follow up when I have
>>     the debug stuff in place.
>>
>>     https://gist.github.com/nevsan/8b6f78d7396963dc5f70
>
>     Another strange thing - it looks as if the initial replica init
>     completes successfully.
>
>     [02/Apr/2014:20:50:18 +0000] NSMMReplicationPlugin - Beginning
>     total update of replica "agmt="cn=meToipa2.example.com"
>     (ipa2:389)".
>
>     On the replica:
>
>     [02/Apr/2014:20:50:18 +0000] NSMMReplicationPlugin -
>     multimaster_be_state_change: replica dc=example,dc=com is going
>     offline; disabling replication
>     [02/Apr/2014:20:50:18 +0000] - WARNING: Import is running with
>     nsslapd-db-private-import-mem on; No other process is allowed to
>     access the database
>     [02/Apr/2014:20:50:21 +0000] - import userRoot: Workers finished;
>     cleaning up...
>     [02/Apr/2014:20:50:21 +0000] - import userRoot: Workers cleaned up.
>     [02/Apr/2014:20:50:21 +0000] - import userRoot: Indexing complete.
>     Post-processing...
>     [02/Apr/2014:20:50:21 +0000] - import userRoot: Generating
>     numSubordinates complete.
>     [02/Apr/2014:20:50:21 +0000] - import userRoot: Flushing caches...
>     [02/Apr/2014:20:50:21 +0000] - import userRoot: Closing files...
>     [02/Apr/2014:20:50:21 +0000] - import userRoot: Import complete.
>     Processed 453 entries in 3 seconds. (151.00 entries/sec)
>     [02/Apr/2014:20:50:21 +0000] NSMMReplicationPlugin -
>     multimaster_be_state_change: replica dc=example,dc=com is coming
>     online; enabling replication
>
>     On the master, access log:
>
>     [02/Apr/2014:20:50:17 +0000] conn=1365 op=15 MOD
>     dn="cn=meToipa2.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
>     tree,cn=config"
>
>     This is the operation that triggers the replica init. Then
>     ipa-replica-install polls for agreement status:
>     [02/Apr/2014:20:50:19 +0000] conn=1365 op=16 SRCH
>     base="cn=meToipa2.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
>     tree,cn=config" scope=0 filter="(objectClass=*)"
>     attrs="nsds5replicaLastInitStart nsds5replicaUpdateInProgress
>     nsds5replicaLastInitStatus cn nsds5BeginReplicaRefresh
>     nsds5replicaLastInitEnd"
>     [02/Apr/2014:20:50:19 +0000] conn=1365 op=16 RESULT err=0 tag=101
>     nentries=1 etime=0
>     [02/Apr/2014:20:50:20 +0000] conn=1365 op=17 SRCH
>     base="cn=meToipa2.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
>     tree,cn=config" scope=0 filter="(objectClass=*)"
>     attrs="nsds5replicaLastInitStart nsds5replicaUpdateInProgress
>     nsds5replicaLastInitStatus cn nsds5BeginReplicaRefresh
>     nsds5replicaLastInitEnd"
>     [02/Apr/2014:20:50:20 +0000] conn=1365 op=17 RESULT err=0 tag=101
>     nentries=1 etime=0
>     [02/Apr/2014:20:50:21 +0000] conn=1365 op=18 SRCH
>     base="cn=meToipa2.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
>     tree,cn=config" scope=0 filter="(objectClass=*)"
>     attrs="nsds5replicaLastInitStart nsds5replicaUpdateInProgress
>     nsds5replicaLastInitStatus cn nsds5BeginReplicaRefresh
>     nsds5replicaLastInitEnd"
>     [02/Apr/2014:20:50:21 +0000] conn=1365 op=18 RESULT err=0 tag=101
>     nentries=1 etime=0
>     [02/Apr/2014:20:50:22 +0000] conn=1365 op=19 SRCH
>     base="cn=meToipa2.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
>     tree,cn=config" scope=0 filter="(objectClass=*)"
>     attrs="nsds5replicaLastInitStart nsds5replicaUpdateInProgress
>     nsds5replicaLastInitStatus cn nsds5BeginReplicaRefresh
>     nsds5replicaLastInitEnd"
>     [02/Apr/2014:20:50:22 +0000] conn=1365 op=19 RESULT err=0 tag=101
>     nentries=1 etime=1
>
>     Something happens here.  The replica init is done, according to
>     the replica error log.  We don't have the replica access log from
>     around this time to see exactly when the connection was closed,
>     but looking at the ipa code, it would appear that ipa did not see
>     a status of "Total update succeeded".  Not sure why the master
>     would not have reported that, unless there was some problem
>     getting back the status from the replica.
>
>     [02/Apr/2014:20:50:22 +0000] conn=1365 op=20 UNBIND
>     [02/Apr/2014:20:50:22 +0000] conn=1365 op=20 fd=114 closed - U1
>
>     Then ipa-replica-install closes the connection and reports the error.
>
>
>>
>>
>>     On Thu, Apr 3, 2014 at 10:38 AM, Rich Megginson
>>     <rmeggins at redhat.com> wrote:
>>
>>         On 04/02/2014 09:22 PM, Nevada Sanchez wrote:
>>>         Okay. Updated the gist with the additional logs:
>>>         https://gist.github.com/nevsan/8b6f78d7396963dc5f70
>>>
>>>
>>
>>         1) Dirsrv is crashing:
>>         [02/Apr/2014:20:49:53 +0000] - 389-Directory/1.3.1.22.a1
>>         B2014.073.1751 starting up
>>         [02/Apr/2014:20:49:54 +0000] - Db home directory is not set.
>>         Possibly nsslapd-directory (optionally
>>         nsslapd-db-home-directory) is missing in the config file.
>>         [02/Apr/2014:20:49:54 +0000] - I'm resizing my cache
>>         now...cache was 710029312 and is now 8000000
>>         [02/Apr/2014:20:49:54 +0000] - 389-Directory/1.3.1.22.a1
>>         B2014.073.1751 starting up
>>         [02/Apr/2014:20:49:54 +0000] - Detected Disorderly Shutdown
>>         last time Directory Server was running, recovering database.
>>         [02/Apr/2014:20:49:55 +0000] - slapd started. Listening on
>>         All Interfaces port 389 for LDAP requests
>>
>>         Please use the instructions at
>>         http://port389.org/wiki/FAQ#Debugging_Crashes to get a core
>>         dump and stack trace.
>>
>>         2) The first occurrence of the connection error is at
>>         [02/Apr/2014:20:52:38 +0000] but there isn't anything in the
>>         consumer error log after [02/Apr/2014:20:50:21 +0000] and in
>>         the consumer access log after [02/Apr/2014:20:50:22 +0000]
>>
>>
>>>         On Wed, Apr 2, 2014 at 9:38 PM, Rich Megginson
>>>         <rmeggins at redhat.com> wrote:
>>>
>>>             On 04/02/2014 03:01 PM, Nevada Sanchez wrote:
>>>>             Okay, I ran it with debug on. The output is quite
>>>>             large. I'm not sure what the etiquette is for posting
>>>>             large logs, so I threw it on gist here:
>>>>             https://gist.githubusercontent.com/nevsan/8b6f78d7396963dc5f70/raw/b76b3c3acce4f12d292d680f4c1dab39c05888d5/gistfile1.txt
>>>>
>>>>
>>>>             Let me know if I should copy it into the thread instead.
>>>
>>>             Ok.  Now can you post excerpts from the dirsrv errors
>>>             log from both the master replica and the replica from
>>>             around the time of the failure?
>>>
>>>
>>>>
>>>>
>>>>             On Wed, Apr 2, 2014 at 1:49 PM, Rich Megginson
>>>>             <rmeggins at redhat.com> wrote:
>>>>
>>>>                 On 04/02/2014 11:45 AM, Nevada Sanchez wrote:
>>>>>                 My apologies. I mistakenly ran the failing
>>>>>                 ldapsearch as an unprivileged user (couldn't
>>>>>                 read the slapd-EXAMPLE-COM directory). Running as
>>>>>                 root, it now works just fine (same result as the
>>>>>                 one that worked). SSL seems not to be the issue.
>>>>>                 Also, I haven't changed the SSL certs since I first
>>>>>                 set up the master.
>>>>>
>>>>>                 I have been doing the replica side things from
>>>>>                 scratch (even so far as starting with a new
>>>>>                 machine). For the master side, I have just been
>>>>>                 re-preparing the replica. I hope I don't have to
>>>>>                 start from scratch with the master replica.
>>>>
>>>>                 I guess the next step would be to do the
>>>>                 ipa-replica-install using -ddd and review the extra
>>>>                 debug information that comes out.
>>>>
>>>>
>>>>>
>>>>>
>>>>>                 On Wed, Apr 2, 2014 at 11:45 AM, Rob Crittenden
>>>>>                 <rcritten at redhat.com> wrote:
>>>>>
>>>>>                     Rich Megginson wrote:
>>>>>
>>>>>                         On 04/02/2014 09:20 AM, Nevada Sanchez wrote:
>>>>>
>>>>>                             Okay, we might be on to something:
>>>>>
>>>>>                             ipa -> ipa2
>>>>>                             ================================
>>>>>                             $ LDAPTLS_CACERTDIR=/etc/dirsrv/slapd-EXAMPLE-COM
>>>>>                             ldapsearch -xLLLZZ -h ipa2.example.com
>>>>>                             -s base -b "" 'objectclass=*' vendorVersion
>>>>>                             dn:
>>>>>                             vendorVersion:
>>>>>                             389-Directory/1.3.1.22.a1 B2014.073.1751
>>>>>                             ================================
>>>>>
>>>>>                             ipa2 -> ipa
>>>>>                             ================================
>>>>>                             $ LDAPTLS_CACERTDIR=/etc/dirsrv/slapd-EXAMPLE-COM
>>>>>                             ldapsearch -xLLLZZ -h ipa.example.com
>>>>>                             -s base -b "" 'objectclass=*' vendorVersion
>>>>>                             ldap_start_tls: Connect error (-11)
>>>>>                             additional info: TLS error
>>>>>                             -8172:Peer's certificate issuer has been
>>>>>                             marked as not trusted by the user.
>>>>>                             ================================
>>>>>
>>>>>                             The original IPA trusts the replica
>>>>>                             (since it signed the cert, I
>>>>>                             assume), but the replica doesn't trust
>>>>>                             the main IPA server. I guess
>>>>>                             the ZZ option would have shown me the
>>>>>                             failure that I missed in my
>>>>>                             initial ldapsearch tests.
>>>>>
>>>>>                         -Z[Z]  Issue StartTLS (Transport Layer
>>>>>                         Security) extended operation. If you use
>>>>>                         -ZZ, the command will require the
>>>>>                         operation to be successful.
>>>>>
>>>>>                         i.e. use SSL, and force a successful handshake
>>>>>
>>>>>
>>>>>                             Anyway, what's the best way to remedy
>>>>>                             this in a way that makes IPA
>>>>>                             happy? (I've found that LDAP can have
>>>>>                             different requirements on which
>>>>>                             certs go where).
>>>>>
>>>>>
>>>>>                         I'm not sure.
>>>>>                         ipa-server-install/ipa-replica-prepare/ipa-replica-install
>>>>>                         is supposed to take care of installing the
>>>>>                         CA cert properly for you. If
>>>>>                         you try to hack it and install the CA cert
>>>>>                         manually, you will probably
>>>>>                         miss something else that ipa install did
>>>>>                         not do.
>>>>>
>>>>>                         I think the only way to ensure that you
>>>>>                         have a properly configured ipa
>>>>>                         server + replicas is to get all of the ipa
>>>>>                         commands completing successfully.
>>>>>
>>>>>                         Which means going back to the drawing
>>>>>                         board and starting over from scratch.
>>>>>
>>>>>
>>>>>                     You can compare the certs that each side is
>>>>>                     using with:
>>>>>
>>>>>                     # certutil -L -d /etc/dirsrv/slapd-EXAMPLE-COM
>>>>>
>>>>>                     Did you by chance replace the SSL server certs
>>>>>                     that IPA uses on your working master?
>>>>>
>>>>>                     rob
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
