[Freeipa-users] multi-master replication

Craig White CWhite at skytouchtechnology.com
Mon Oct 27 16:12:54 UTC 2014


From: freeipa-users-bounces at redhat.com [mailto:freeipa-users-bounces at redhat.com] On Behalf Of Rich Megginson
Sent: Monday, October 27, 2014 6:42 AM
To: freeipa-users at redhat.com
Subject: Re: [Freeipa-users] multi-master replication

On 10/25/2014 06:17 PM, Dmitri Pal wrote:
On 10/24/2014 07:15 PM, Craig White wrote:
From: freeipa-users-bounces at redhat.com<mailto:freeipa-users-bounces at redhat.com> [mailto:freeipa-users-bounces at redhat.com] On Behalf Of Craig White
Sent: Friday, October 24, 2014 4:02 PM
To: freeipa-users at redhat.com<mailto:freeipa-users at redhat.com>
Subject: [Freeipa-users] multi-master replication

I would have thought that changes go from replica to master and not just master to replica.

Is there something I have to do to make the changes bi-directional?

Replying to my own post...
Logs are my friend  ;-)

[24/Oct/2014:23:08:17 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.domain.local " (ipa001:389): Replication bind with GSSAPI auth resumed
[24/Oct/2014:23:08:17 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.domain.local " (ipa001:389): Warning: unable to replicate schema: rc=2
[24/Oct/2014:23:08:17 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.domain.local " (ipa001:389): Failed to send update operation to consumer (uniqueid e018060f-5bb011e4-81078979-dc802980, CSN 544aa346000000030000): Can't contact LDAP server. Will retry later.
[24/Oct/2014:23:08:17 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.domain.local " (ipa001:389): Consumer failed to replay change (uniqueid (null), CSN (null)): Can't contact LDAP server(-1). Will retry later.

These NULLs look suspicious.
I hope DS gurus will have more for you on Monday.

1) Yes, replication is fully bi-directional.
2) What are the exact versions of dirsrv?  rpm -q 389-ds-base on supplier and consumer.
3) Can you reproduce the problem using the replication log level on both the supplier and consumer?  http://www.port389.org/docs/389ds/FAQ/faq.html#troubleshooting
----

1)      I sort of expected it to be - that's what had me worried when it didn't work as expected.

2)      ]# rpm -qa | grep 389

389-ds-base-1.2.11.15-47.el6.x86_64

389-ds-base-libs-1.2.11.15-47.el6.x86_64

3)      Upped the log levels - hopefully this is a reasonable representative from ipa002 (the replica)

[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - conn=4 op=229791 repl="dc=stt,dc=local": Acquired replica
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - conn=4 op=229791 repl="dc=stt,dc=local": StartNSDS90ReplicationRequest: response=0 rc=0
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - conn=4 op=229791 Relinquishing consumer connection extension
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - ruv_add_csn_inprogress: successfully inserted csn 544e6b02000000040000 into pending list
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - Purged state information from entry krbprincipalname=ldap/ipa002.FQDN at FQDN,cn=services,cn=accounts,dc=stt,dc=local up to CSN 5445307e000000040000
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - changelog program - _cl5GetDBFileByReplicaName: found DB object f32b80 for database /var/lib/dirsrv/slapd-STT-LOCAL/cldb/dababc08-5bb011e4-81078979-dc802980_544aa332000000040000.db4
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - changelog program - _cl5GetDBFileByReplicaName: found DB object f32b80 for database /var/lib/dirsrv/slapd-STT-LOCAL/cldb/dababc08-5bb011e4-81078979-dc802980_544aa332000000040000.db4
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - ruv_update_ruv: successfully committed csn 544e6b02000000040000
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): Disconnected from the consumer
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): Warning: unable to replicate schema: rc=2
[27/Oct/2014:15:55:47 +0000] - csngen_adjust_time: gen state before 544e6b040001:1414425347:0:1
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - changelog program - _cl5GetDBFile: found DB object f32b80 for database /var/lib/dirsrv/slapd-STT-LOCAL/cldb/dababc08-5bb011e4-81078979-dc802980_544aa332000000040000.db4
[27/Oct/2014:15:55:47 +0000] - _cl5PositionCursorForReplay (agmt="cn=meToipa001.FQDN" (ipa001:389)): Consumer RUV:
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): {replicageneration} 544aa332000000040000
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): {replica 4 ldap://ipa001.FQDN:389} 544aa3a1000000040000 544e6b02000000040000 00000000
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): {replica 3 ldap://ipa002.FQDN:389} 544aa339000000030000 544aa33f000400030000 00000000
[27/Oct/2014:15:55:47 +0000] - _cl5PositionCursorForReplay (agmt="cn=meToipa001.FQDN" (ipa001:389)): Supplier RUV:
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): {replicageneration} 544aa332000000040000
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): {replica 3 ldap://ipa002.FQDN:389} 544aa339000000030000 544e6aa4000300030000 544e6aa3
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): {replica 4 ldap://ipa001.FQDN:389} 544aa3a1000000040000 544e6b02000000040000 544e6b03
[27/Oct/2014:15:55:47 +0000] agmt="cn=meToipa001.FQDN" (ipa001:389) - clcache_get_buffer: found thread private buffer cache 7fe85000ea50
[27/Oct/2014:15:55:47 +0000] agmt="cn=meToipa001.FQDN" (ipa001:389) - clcache_get_buffer: _pool is f89af0 _pool->pl_busy_lists is 7fe8500197d0 _pool->pl_busy_lists->bl_buffers is 7fe85000ea50
[27/Oct/2014:15:55:47 +0000] agmt="cn=meToipa001.FQDN" (ipa001:389) - session start: anchorcsn=544aa33f000400030000
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - changelog program - agmt="cn=meToipa001.FQDN" (ipa001:389): CSN 544aa33f000400030000 found, position set for replay
[27/Oct/2014:15:55:47 +0000] agmt="cn=meToipa001.FQDN" (ipa001:389) - load=1 rec=1 csn=544aa346000000030000
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): replay_update: Sending add operation (dn="cn=KDC,cn=ipa002.FQDN,cn=masters,cn=ipa,cn=etc,dc=stt,dc=local" csn=544aa346000000030000)
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): replay_update: Consumer could not replay operation with csn 544aa346000000030000
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): Failed to send update operation to consumer (uniqueid e018060f-5bb011e4-81078979-dc802980, CSN 544aa346000000030000): Can't contact LDAP server. Will retry later.
[27/Oct/2014:15:55:47 +0000] - repl5_inc_result_threadmain starting
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): Connection disconnected by another thread
[27/Oct/2014:15:55:47 +0000] - repl5_inc_result_threadmain: read result for message_id 0
[27/Oct/2014:15:55:47 +0000] - repl5_inc_result_threadmain: result 3, -1, 2, 0, (null)
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): Consumer failed to replay change (uniqueid (null), CSN (null)): Can't contact LDAP server(-1). Will retry later.
[27/Oct/2014:15:55:47 +0000] - repl5_inc_result_threadmain: got op result 205 should finish 1
[27/Oct/2014:15:55:47 +0000] - repl5_inc_result_threadmain exiting
[27/Oct/2014:15:55:47 +0000] agmt="cn=meToipa001.FQDN" (ipa001:389) - session end: state=0 load=1 sent=1 skipped=0 skipped_new_rid=0 skipped_csn_gt_cons_maxcsn=0 skipped_up_to_date=0 skipped_csn_gt_ruv=0 skipped_csn_covered=0
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): Warning: unable to send endReplication extended operation (Can't contact LDAP server)
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): State: sending_updates -> start_backoff
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): State: start_backoff -> start_backoff
[27/Oct/2014:15:55:49 +0000] NSMMReplicationPlugin - conn=4 op=229793 Acquired consumer connection extension
[27/Oct/2014:15:55:49 +0000] NSMMReplicationPlugin - conn=4 op=229793 repl="dc=stt,dc=local": Released replica held by locking_purl=conn=4 id=229791
[27/Oct/2014:15:55:49 +0000] NSMMReplicationPlugin - conn=4 op=229793 Relinquishing consumer connection extension
[27/Oct/2014:15:55:51 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): State: start_backoff -> backoff
[27/Oct/2014:15:55:51 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): Trying non-secure slapi_ldap_init_ext
[27/Oct/2014:15:55:51 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): binddn = ,  passwd =
[27/Oct/2014:15:55:51 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): Replication bind with GSSAPI auth resumed
[27/Oct/2014:15:55:51 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): No linger to cancel on the connection
[27/Oct/2014:15:55:51 +0000] - _csngen_adjust_local_time: gen state before 544e6b040001:1414425347:0:1
[27/Oct/2014:15:55:51 +0000] - _csngen_adjust_local_time: gen state after 544e6b080000:1414425351:0:1
[27/Oct/2014:15:55:51 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): Replica was successfully acquired.
[27/Oct/2014:15:55:51 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" (ipa001:389): State: backoff -> sending_updates
[27/Oct/2014:15:55:51 +0000] NSMMReplicationPlugin - conn=4 op=229794 Acquired consumer connection extension
[27/Oct/2014:15:55:51 +0000] NSMMReplicationPlugin - conn=4 op=229794 repl="dc=stt,dc=local": Begin incremental protocol
[27/Oct/2014:15:55:51 +0000] - csngen_adjust_time: gen state before 544e6b080001:1414425351:0:1
[27/Oct/2014:15:55:51 +0000] - csngen_adjust_time: gen state after 544e6b080001:1414425351:0:1

This is the log from ipa002 (which was the replica) and I am confused by the error logged, 'Can't contact LDAP server' because it surely can...
]# telnet 172.29.31.100 389
Trying 172.29.31.100...
Connected to 172.29.31.100.
Escape character is '^]'.
^]
telnet> quit
Connection closed.

This is a new setup (setup master on Thursday, imported users from OpenLDAP and then setup replica on Friday). Had a few issues that I didn't understand which may or may not be relevant. First, after I made some changes to SSSD to allow for sudo and hopefully HBAC, it seemed like the directory server crashed and I had to restart it. Seemed like the same thing happened at the same time when I likewise configured the replica. Also, when I was trying to setup the replica, I had to resort to -skip-conncheck as it (presume Kerberos) not only asked for the password for admin@$DOMAIN but also the password for admin at ipa001.$DOMAIN<mailto:admin at ipa001.$DOMAIN> for which I had never specifically setup any password and no password already used in Kerberos worked.

Thanks

Craig
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/freeipa-users/attachments/20141027/291c62ca/attachment.htm>


More information about the Freeipa-users mailing list