[Freeipa-users] IPA Replica Issues (Total update abortedLDAP error: Can't contact LDAP server)
Rich Megginson
rmeggins at redhat.com
Sat Apr 5 01:16:28 UTC 2014
On 04/03/2014 10:25 PM, Nevada Sanchez wrote:
> I followed the instructions that would give me a core dump, and for
> some reason, I don't see one in /var/log/dirsrv/slapd-EXAMPLE-COM/,
> even though I still see the Disorderly shutdown still shows up in the
> logs.
Hmm - check again - it should produce a core file
grep -i segfault /var/log/messages
> I know that when I explicitly request those attributes, I get "-1
> Total update abortedLDAP error: Can't contact LDAP server" for
> nds5ReplicaLastInitStatus (see below). Access logs stop completely on
> the replica after the time that you mentioned.
Hmm - looks like a bug. Please open a ticket.
>
> ======================================================
> [root at ipa2 ipaserver]# ldapsearch ldaps://ipa.example.com:636
> <http://ipa.example.com:636/> -D 'cn=Directory Manager' -w ##### -b
> 'cn=meToipa2.example.com
> <http://metoipa2.example.com/>,cn=replica,cn=dc\=example\,dc\=com,cn=mapping
> tree,cn=config' '(objectClass=*)' -s base nsds5ReplicaLastInitStart
> nsds5replicaUpdateInProgress nsds5ReplicaLastInitStatus cn
> nsds5BeginReplicaRefresh nsds5ReplicaLastInitEnd
> # extended LDIF
> #
> # LDAPv3
> # base <cn=meToipa2.example.com
> <http://metoipa2.example.com/>,cn=replica,cn=dc\=example\,dc\=com,cn=mapping
> tree,cn=config> with scope baseObject
> # filter: (objectclass=*)
> # requesting: ldaps://ipa.example.com:636
> <http://ipa.example.com:636/> (objectClass=*)
> nsds5ReplicaLastInitStart nsds5replicaUpdateInProgress
> nsds5ReplicaLastInitStatus cn nsds5BeginReplicaRefresh
> nsds5ReplicaLastInitEnd
> #
>
> # meToipa2.example.com <http://metoipa2.example.com/>, replica,
> dc\3Dexample\2Cdc\3Dcom,
> mapping tree, config
> dn: cn=meToipa2.example.com
> <http://metoipa2.example.com/>,cn=replica,cn=dc\3Dexample\2Cd
> c\3Dcom,cn=mapping tree,cn=config
> nsds5ReplicaLastInitStart: 20140401092800Z
> nsds5replicaUpdateInProgress: FALSE
> nsds5ReplicaLastInitStatus: -1 Total update abortedLDAP error: Can't
> contact L
> DAP server
> cn: meToipa2.example.com <http://metoipa2.example.com/>
> nsds5ReplicaLastInitEnd: 20140401092804Z
>
> # search result
> search: 2
> result: 0 Success
>
> # numResponses: 2
> # numEntries: 1
>
>
> On Thu, Apr 3, 2014 at 6:32 PM, Rich Megginson <rmeggins at redhat.com
> <mailto:rmeggins at redhat.com>> wrote:
>
> On 04/03/2014 03:46 PM, Nevada Sanchez wrote:
>> Okay, I updated the gist and extended some of the logs
>> (ipa2-errors does stop at 20:50:21). I'll follow up when I have
>> the debug stuff in place.
>>
>> https://gist.github.com/nevsan/8b6f78d7396963dc5f70
>
> Another strange thing - it looks as if the initial replica init
> completes successfully.
>
> [02/Apr/2014:20:50:18 +0000] NSMMReplicationPlugin - Beginning
> total update of replica "agmt="cn=meToipa2.example.com
> <http://meToipa2.example.com>" (ipa2:389)".
>
> On the replica:
>
> [02/Apr/2014:20:50:18 +0000] NSMMReplicationPlugin -
> multimaster_be_state_change: replica dc=example,dc=com is going
> offline; disabling replication
> [02/Apr/2014:20:50:18 +0000] - WARNING: Import is running with
> nsslapd-db-private-import-mem on; No other process is allowed to
> access the database
> [02/Apr/2014:20:50:21 +0000] - import userRoot: Workers finished;
> cleaning up...
> [02/Apr/2014:20:50:21 +0000] - import userRoot: Workers cleaned up.
> [02/Apr/2014:20:50:21 +0000] - import userRoot: Indexing complete.
> Post-processing...
> [02/Apr/2014:20:50:21 +0000] - import userRoot: Generating
> numSubordinates complete.
> [02/Apr/2014:20:50:21 +0000] - import userRoot: Flushing caches...
> [02/Apr/2014:20:50:21 +0000] - import userRoot: Closing files...
> [02/Apr/2014:20:50:21 +0000] - import userRoot: Import complete.
> Processed 453 entries in 3 seconds. (151.00 entries/sec)
> [02/Apr/2014:20:50:21 +0000] NSMMReplicationPlugin -
> multimaster_be_state_change: replica dc=example,dc=com is coming
> online; enabling replication
>
> On the master, access log:
>
> [02/Apr/2014:20:50:17 +0000] conn=1365 op=15 MOD
> dn="cn=meToipa2.example.com
> <http://meToipa2.example.com>,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
> tree,cn=config"
>
> This is the operation that triggers the replica init. Then
> ipa-replica-install polls for agreement status:
> [02/Apr/2014:20:50:19 +0000] conn=1365 op=16 SRCH
> base="cn=meToipa2.example.com
> <http://meToipa2.example.com>,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
> tree,cn=config" scope=0 filter="(objectClass=*)"
> attrs="nsds5replicaLastInitStart nsds5replicaUpdateInProgress
> nsds5replicaLastInitStatus cn nsds5BeginReplicaRefresh
> nsds5replicaLastInitEnd"
> [02/Apr/2014:20:50:19 +0000] conn=1365 op=16 RESULT err=0 tag=101
> nentries=1 etime=0
> [02/Apr/2014:20:50:20 +0000] conn=1365 op=17 SRCH
> base="cn=meToipa2.example.com
> <http://meToipa2.example.com>,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
> tree,cn=config" scope=0 filter="(objectClass=*)"
> attrs="nsds5replicaLastInitStart nsds5replicaUpdateInProgress
> nsds5replicaLastInitStatus cn nsds5BeginReplicaRefresh
> nsds5replicaLastInitEnd"
> [02/Apr/2014:20:50:20 +0000] conn=1365 op=17 RESULT err=0 tag=101
> nentries=1 etime=0
> [02/Apr/2014:20:50:21 +0000] conn=1365 op=18 SRCH
> base="cn=meToipa2.example.com
> <http://meToipa2.example.com>,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
> tree,cn=config" scope=0 filter="(objectClass=*)"
> attrs="nsds5replicaLastInitStart nsds5replicaUpdateInProgress
> nsds5replicaLastInitStatus cn nsds5BeginReplicaRefresh
> nsds5replicaLastInitEnd"
> [02/Apr/2014:20:50:21 +0000] conn=1365 op=18 RESULT err=0 tag=101
> nentries=1 etime=0
> [02/Apr/2014:20:50:22 +0000] conn=1365 op=19 SRCH
> base="cn=meToipa2.example.com
> <http://meToipa2.example.com>,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
> tree,cn=config" scope=0 filter="(objectClass=*)"
> attrs="nsds5replicaLastInitStart nsds5replicaUpdateInProgress
> nsds5replicaLastInitStatus cn nsds5BeginReplicaRefresh
> nsds5replicaLastInitEnd"
> [02/Apr/2014:20:50:22 +0000] conn=1365 op=19 RESULT err=0 tag=101
> nentries=1 etime=1
>
> Something happens here. The replica init is done, according to
> the replica error log. We don't have the replica access log from
> around this time to see exactly when the connection was closed,
> but looking at the ipa code, it would appear that ipa did not see
> a status of "Total update succeeded". Not sure why the master
> would not have reported that, unless there was some problem
> getting back the status from the replica.
>
> [02/Apr/2014:20:50:22 +0000] conn=1365 op=20 UNBIND
> [02/Apr/2014:20:50:22 +0000] conn=1365 op=20 fd=114 closed - U1
>
> Then ipa-replica-install closes the connection and reports the error.
>
>
>>
>>
>> On Thu, Apr 3, 2014 at 10:38 AM, Rich Megginson
>> <rmeggins at redhat.com <mailto:rmeggins at redhat.com>> wrote:
>>
>> On 04/02/2014 09:22 PM, Nevada Sanchez wrote:
>>> Okay. Updated the gist with the additional logs:
>>> https://gist.github.com/nevsan/8b6f78d7396963dc5f70
>>>
>>>
>>
>> 1) Dirsrv is crashing:
>> [02/Apr/2014:20:49:53 +0000] - 389-Directory/1.3.1.22.a1
>> B2014.073.1751 starting up
>> [02/Apr/2014:20:49:54 +0000] - Db home directory is not set.
>> Possibly nsslapd-directory (optionally
>> nsslapd-db-home-directory) is missing in the config file.
>> [02/Apr/2014:20:49:54 +0000] - I'm resizing my cache
>> now...cache was 710029312 and is now 8000000
>> [02/Apr/2014:20:49:54 +0000] - 389-Directory/1.3.1.22.a1
>> B2014.073.1751 starting up
>> [02/Apr/2014:20:49:54 +0000] - Detected Disorderly Shutdown
>> last time Directory Server was running, recovering database.
>> [02/Apr/2014:20:49:55 +0000] - slapd started. Listening on
>> All Interfaces port 389 for LDAP requests
>>
>> Please use the instructions at
>> http://port389.org/wiki/FAQ#Debugging_Crashes to get a core
>> dump and stack trace.
>>
>> 2) The first occurrence of the connection error is at
>> [02/Apr/2014:20:52:38 +0000] but there isn't anything in the
>> consumer error log after [02/Apr/2014:20:50:21 +0000] and in
>> the consumer access log after [02/Apr/2014:20:50:22 +0000]
>>
>>
>>> On Wed, Apr 2, 2014 at 9:38 PM, Rich Megginson
>>> <rmeggins at redhat.com <mailto:rmeggins at redhat.com>> wrote:
>>>
>>> On 04/02/2014 03:01 PM, Nevada Sanchez wrote:
>>>> Okay, I ran it with debug on. The output is quite
>>>> large. I'm not sure what the etiquette is for posting
>>>> large logs, so I threw it on gist here:
>>>> https://gist.githubusercontent.com/nevsan/8b6f78d7396963dc5f70/raw/b76b3c3acce4f12d292d680f4c1dab39c05888d5/gistfile1.txt
>>>> <http://gist.githubusercontent.com/nevsan/8b6f78d7396963dc5f70/raw/b76b3c3acce4f12d292d680f4c1dab39c05888d5/gistfile1.txt>
>>>>
>>>>
>>>> Let me know if I should copy it into the thread instead.
>>>
>>> Ok. Now can you post excerpts from the dirsrv errors
>>> log from both the master replica and the replica from
>>> around the time of the failure?
>>>
>>>
>>>>
>>>>
>>>> On Wed, Apr 2, 2014 at 1:49 PM, Rich Megginson
>>>> <rmeggins at redhat.com <mailto:rmeggins at redhat.com>> wrote:
>>>>
>>>> On 04/02/2014 11:45 AM, Nevada Sanchez wrote:
>>>>> My apologies. I mistakenly ran the failing
>>>>> ldapsearch from an unpriviliged user (couldn't
>>>>> read slapd-EXAMPLE-COM directory). Running as
>>>>> root, it now works just fine (same result as the
>>>>> one that worked). SSL seems to not be the issue.
>>>>> Also, I haven't change the SSL certs since I first
>>>>> set up the master.
>>>>>
>>>>> I have been doing the replica side things from
>>>>> scratch (even so far as starting with a new
>>>>> machine). For the master side, I have just been
>>>>> re-preparing the replica. I hope I don't have to
>>>>> start from scratch with the master replica.
>>>>
>>>> I guess the next step would be to do the
>>>> ipa-replica-install using -ddd and review the extra
>>>> debug information that comes out.
>>>>
>>>>
>>>>>
>>>>>
>>>>> On Wed, Apr 2, 2014 at 11:45 AM, Rob Crittenden
>>>>> <rcritten at redhat.com <mailto:rcritten at redhat.com>>
>>>>> wrote:
>>>>>
>>>>> Rich Megginson wrote:
>>>>>
>>>>> On 04/02/2014 09:20 AM, Nevada Sanchez wrote:
>>>>>
>>>>> Okay, we might be on to something:
>>>>>
>>>>> ipa -> ipa2
>>>>> ================================
>>>>> $
>>>>> LDAPTLS_CACERTDIR=/etc/dirsrv/slapd-EXAMPLE-COM
>>>>> ldapsearch -xLLLZZ
>>>>> -h ipa2.example.com
>>>>> <http://ipa2.example.com>
>>>>> <http://ipa2.example.com> -s base -b ""
>>>>>
>>>>> 'objectclass=*' vendorVersion
>>>>> dn:
>>>>> vendorVersion:
>>>>> 389-Directory/1.3.1.22.a1 B2014.073.1751
>>>>> ================================
>>>>>
>>>>> ipa2 -> ipa
>>>>> ================================
>>>>> $
>>>>> LDAPTLS_CACERTDIR=/etc/dirsrv/slapd-EXAMPLE-COM
>>>>> ldapsearch -xLLLZZ
>>>>> -h ipa.example.com
>>>>> <http://ipa.example.com>
>>>>> <http://ipa.example.com> -s base -b ""
>>>>>
>>>>> 'objectclass=*' vendorVersion
>>>>> ldap_start_tls: Connect error (-11)
>>>>> additional info: TLS error
>>>>> -8172:Peer's certificate issuer has been
>>>>> marked as not trusted by the user.
>>>>> ================================
>>>>>
>>>>> The original IPA trusts the replica
>>>>> (since it signed the cert, I
>>>>> assume), but the replica doesn't trust
>>>>> the main IPA server. I guess
>>>>> the ZZ option would have shown me the
>>>>> failure that I missed in my
>>>>> initial ldapsearch tests.
>>>>>
>>>>> -Z[Z] Issue StartTLS (Transport Layer
>>>>> Security) extended
>>>>> operation. If
>>>>> you use -ZZ, the command will require
>>>>> the operation to
>>>>> be suc-
>>>>> cessful.
>>>>>
>>>>> i.e. use SSL, and force a successful handshake
>>>>>
>>>>>
>>>>> Anyway, what's the best way to remedy
>>>>> this in a way that makes IPA
>>>>> happy? (I've found that LDAP can have
>>>>> different requirements on which
>>>>> certs go where).
>>>>>
>>>>>
>>>>> I'm not sure.
>>>>> ipa-server-install/ipa-replica-prepare/ipa-replica-install
>>>>> is supposed to take care of installing the
>>>>> CA cert properly for you. If
>>>>> you try to hack it and install the CA cert
>>>>> manually, you will probably
>>>>> miss something else that ipa install did
>>>>> not do.
>>>>>
>>>>> I think the only way to ensure that you
>>>>> have a properly configured ipa
>>>>> server + replicas is to get all of the ipa
>>>>> commands completing successfully.
>>>>>
>>>>> Which means going back to the drawing
>>>>> board and starting over from scratch.
>>>>>
>>>>>
>>>>> You can compare the certs that each side is
>>>>> using with:
>>>>>
>>>>> # certutil -L -d /etc/dirsrv/slapd-EXAMPLE-COM
>>>>>
>>>>> Did you by chance replace the SSL server certs
>>>>> that IPA uses on your working master?
>>>>>
>>>>> rob
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/freeipa-users/attachments/20140404/762c228c/attachment.htm>
More information about the Freeipa-users
mailing list