<html> <head> <meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type"> </head> <body bgcolor="#FFFFFF" text="#000000"> Hi Oleg,<br>
<br>
the problem seems to be on replica2, where it logs this error:<br>
<pre>[16/Jun/2015:10:18:34 -0400] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: retry (49) the transaction (csn=55802fcf000300040000) failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[16/Jun/2015:10:18:34 -0400] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: failed to write entry with csn (55802fcf000300040000); db error - -30993 BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock
[16/Jun/2015:10:18:34 -0400] NSMMReplicationPlugin - write_changelog_and_ruv: can't add a change for uid=onmaster,cn=users,cn=accounts,dc=bagam,dc=net (uniqid: b8242e18-143111e5-b1d0d0c3-ae5854ff, optype: 32) to changelog csn 55802fcf000300040000
[16/Jun/2015:10:18:34 -0400] - SLAPI_PLUGIN_BE_TXN_POST_DELETE_FN plugin returned error code but did not set SLAPI_RESULT_CODE</pre>
However, replication seems to continue without repeating this:<br>
<pre>[16/Jun/2015:10:18:27 -0400] conn=8 op=6 DEL dn="uid=onmaster,cn=users,cn=accounts,dc=bagam,dc=net"
[16/Jun/2015:10:18:35 -0400] conn=8 op=6 RESULT err=1 tag=107 nentries=0 etime=8 csn=55802fcf000300040000
[16/Jun/2015:10:18:35 -0400] conn=8 op=7 MOD dn="cn=ipausers,cn=groups,cn=accounts,dc=bagam,dc=net"
[16/Jun/2015:10:18:36 -0400] conn=8 op=7 RESULT err=0 tag=103 nentries=0 etime=1 csn=55802fcf000400040000
[16/Jun/2015:10:18:36 -0400] conn=8 op=8 DEL dn="cn=onmaster,cn=groups,cn=accounts,dc=bagam,dc=net"
[16/Jun/2015:10:18:37 -0400] conn=8 op=8 RESULT err=0 tag=107 nentries=0 etime=1 csn=55802fcf000700040000
[16/Jun/2015:10:18:37 -0400] conn=8 op=9 MOD dn="cn=ipausers,cn=groups,cn=accounts,dc=bagam,dc=net"
[16/Jun/2015:10:18:37 -0400] conn=8 op=9 RESULT err=0 tag=103 nentries=0 etime=0 csn=55802fd0000000060000</pre>
I don't see why there is a deadlock. Is it reproducible every time?<br>
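<p>In case it helps when going through the full access logs, here is a minimal sketch (not an official tool) that lists replicated operations whose RESULT has a non-zero err and counts how many RESULT lines carry each CSN, which gives a rough indication of whether a failed change was replayed again. The sample data is just the access log lines quoted above; the function name <code>scan_access_log</code> and the regular expressions are my own assumptions about the line layout, so adjust them if your logs differ.</p>

```python
import re
from collections import Counter

# Sample data: the replica2 access log lines quoted in this mail.
ACCESS_LOG = """\
[16/Jun/2015:10:18:27 -0400] conn=8 op=6 DEL dn="uid=onmaster,cn=users,cn=accounts,dc=bagam,dc=net"
[16/Jun/2015:10:18:35 -0400] conn=8 op=6 RESULT err=1 tag=107 nentries=0 etime=8 csn=55802fcf000300040000
[16/Jun/2015:10:18:35 -0400] conn=8 op=7 MOD dn="cn=ipausers,cn=groups,cn=accounts,dc=bagam,dc=net"
[16/Jun/2015:10:18:36 -0400] conn=8 op=7 RESULT err=0 tag=103 nentries=0 etime=1 csn=55802fcf000400040000
[16/Jun/2015:10:18:36 -0400] conn=8 op=8 DEL dn="cn=onmaster,cn=groups,cn=accounts,dc=bagam,dc=net"
[16/Jun/2015:10:18:37 -0400] conn=8 op=8 RESULT err=0 tag=107 nentries=0 etime=1 csn=55802fcf000700040000
[16/Jun/2015:10:18:37 -0400] conn=8 op=9 MOD dn="cn=ipausers,cn=groups,cn=accounts,dc=bagam,dc=net"
[16/Jun/2015:10:18:37 -0400] conn=8 op=9 RESULT err=0 tag=103 nentries=0 etime=0 csn=55802fd0000000060000
"""

# Assumed line layouts, derived only from the sample lines above.
OP_RE = re.compile(r'conn=(\d+) op=(\d+) (ADD|MOD|DEL|MODRDN) dn="([^"]*)"')
RESULT_RE = re.compile(r'conn=(\d+) op=(\d+) RESULT err=(\d+) .*csn=([0-9a-f]{20})')


def scan_access_log(text):
    """Return (failures, attempts) for operations whose RESULT carries a CSN."""
    requests = {}         # (conn, op) -> (operation type, dn)
    failures = []         # one dict per RESULT line with err != 0
    attempts = Counter()  # csn -> number of RESULT lines carrying it
    for line in text.splitlines():
        req = OP_RE.search(line)
        if req:
            requests[(req.group(1), req.group(2))] = (req.group(3), req.group(4))
            continue
        res = RESULT_RE.search(line)
        if not res:
            continue
        conn, op, err, csn = res.groups()
        attempts[csn] += 1
        if err != "0":
            optype, dn = requests.get((conn, op), ("?", "?"))
            failures.append({"conn": conn, "op": op, "type": optype,
                             "dn": dn, "err": int(err), "csn": csn})
    return failures, attempts


failures, attempts = scan_access_log(ACCESS_LOG)
for f in failures:
    print(f"{f['type']} {f['dn']} failed with err={f['err']}, "
          f"csn={f['csn']}, seen in {attempts[f['csn']]} RESULT line(s)")
```

<p>To use it on a real log, read the file contents and pass them to <code>scan_access_log</code> instead of the sample string. A CSN that fails once and never shows up again with err=0 would be a candidate for a change that was skipped.</p>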
<div class="moz-cite-prefix">On 06/16/2015 04:49 PM, Oleg Fayans wrote: </div> <blockquote cite="mid:55803767.5080906@redhat.com" type="cite">Hi all,<br>
<br>
I've bumped into a strange problem where only part of the changes made on the master during a replica outage get replicated after the replica recovers. Namely: when I delete an existing user on the master while the node is offline, this change does not reach the node when it is back online. User creation, however, gets replicated as expected.<br>
<br>
Steps to reproduce:<br>
1. Create the following topology: replica1 &lt;-&gt; master &lt;-&gt; replica2 &lt;-&gt; replica3<br>
2. Create user1 on master, make sure it appears on all replicas<br>
3. Turn off replica2<br>
4. On master delete user1 and create user2, make sure the changes get replicated to replica1<br>
5. Turn on replica2<br>
<br>
Expected results, a minute or so after replica2 is back up:<br>
1. user1 exists on neither replica2 nor replica3<br>
2. user2 exists on both replica2 and replica3<br>
<br>
Actual results:<br>
1. user1 coexists with user2 on replica2 and replica3<br>
2. master and replica1 have only user2<br>
<br>
In my case, though, the topology was as follows:<br>
<pre>$ ipa topologysegment-find realm
------------------
3 segments matched
------------------
Segment name: f22master.bagam.net-to-f22replica3.bagam.net
Left node: f22master.bagam.net
Right node: f22replica3.bagam.net
Connectivity: both

Segment name: replica1-to-replica2
Left node: f22replica1.bagam.net
Right node: f22replica2.bagam.net
Connectivity: both

Segment name: replica2-to-master
Left node: f22replica2.bagam.net
Right node: f22master.bagam.net
Connectivity: both
----------------------------
Number of entries returned 3
----------------------------</pre>
And I was turning off replica2, leaving replica1 offline, but that does not really matter.<br>
<br>
The dirsrv error message most likely to be relevant is:<br>
<pre>Consumer failed to replay change (uniqueid b8242e18-143111e5-b1d0d0c3-ae5854ff, CSN 55802fcf000300040000): Operations error (1). Will retry later</pre>
I am attaching the dirsrv error and access logs from all nodes, in case they are useful. <fieldset class="mimeAttachmentHeader"></fieldset> </blockquote> </body> </html>