[Freeipa-devel] user deletion in offline mode does not get replicated after node recovery

Ludwig Krispenz lkrispen at redhat.com
Tue Jun 16 15:33:52 UTC 2015


Hi Oleg,

the problem seems to be on replica2, when it logs this error:

[16/Jun/2015:10:18:34 -0400] NSMMReplicationPlugin - changelog program - 
_cl5WriteOperationTxn: retry (49) the transaction 
(csn=55802fcf000300040000) failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: 
Locker killed to resolve a deadlock))
[16/Jun/2015:10:18:34 -0400] NSMMReplicationPlugin - changelog program - 
_cl5WriteOperationTxn: failed to write entry with csn 
(55802fcf000300040000); db error - -30993 BDB0068 DB_LOCK_DEADLOCK: 
Locker killed to resolve a deadlock
[16/Jun/2015:10:18:34 -0400] NSMMReplicationPlugin - 
write_changelog_and_ruv: can't add a change for 
uid=onmaster,cn=users,cn=accounts,dc=bagam,dc=net (uniqid: 
b8242e18-143111e5-b1d0d0c3-ae5854ff, optype: 32) to changelog csn 
55802fcf000300040000
[16/Jun/2015:10:18:34 -0400] - SLAPI_PLUGIN_BE_TXN_POST_DELETE_FN plugin 
returned error code but did not set SLAPI_RESULT_CODE

but replication seems to continue, and the failed operation is not retried:

[16/Jun/2015:10:18:27 -0400] conn=8 op=6 DEL 
dn="uid=onmaster,cn=users,cn=accounts,dc=bagam,dc=net"
[16/Jun/2015:10:18:35 -0400] conn=8 op=6 RESULT err=1 tag=107 nentries=0 
etime=8 csn=55802fcf000300040000
[16/Jun/2015:10:18:35 -0400] conn=8 op=7 MOD 
dn="cn=ipausers,cn=groups,cn=accounts,dc=bagam,dc=net"
[16/Jun/2015:10:18:36 -0400] conn=8 op=7 RESULT err=0 tag=103 nentries=0 
etime=1 csn=55802fcf000400040000
[16/Jun/2015:10:18:36 -0400] conn=8 op=8 DEL 
dn="cn=onmaster,cn=groups,cn=accounts,dc=bagam,dc=net"
[16/Jun/2015:10:18:37 -0400] conn=8 op=8 RESULT err=0 tag=107 nentries=0 
etime=1 csn=55802fcf000700040000
[16/Jun/2015:10:18:37 -0400] conn=8 op=9 MOD 
dn="cn=ipausers,cn=groups,cn=accounts,dc=bagam,dc=net"
[16/Jun/2015:10:18:37 -0400] conn=8 op=9 RESULT err=0 tag=103 nentries=0 
etime=0 csn=55802fd0000000060000
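
To check a complete access log for this pattern, a small script along the following lines might help. It is only a sketch: it assumes each access log entry is on a single line (unlike the wrapped excerpt above), and the function name and regular expressions are mine, not part of any 389-ds tooling. It reports operations that carry a CSN, returned a non-zero err, and never show up again with err=0.

```python
import re

# "conn=N op=M <rest>" as found in 389-ds access log lines.
LINE_RE = re.compile(r"conn=(\d+) op=(\d+) (.*)")
# Request lines such as: DEL dn="uid=...".
REQUEST_RE = re.compile(r'^(\w+) dn="([^"]*)"')
# Result lines of replicated operations carry a 20-hex-digit csn.
RESULT_RE = re.compile(r"^RESULT err=(\d+).*?csn=([0-9a-f]{20})")


def unresolved_failures(lines):
    """Map csn -> (optype, dn, err) for failed operations never seen with err=0."""
    requests = {}      # (conn, op) -> (optype, dn)
    failed = {}        # csn -> (optype, dn, err)
    succeeded = set()  # csns that were applied with err=0 at least once
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        key, rest = (m.group(1), m.group(2)), m.group(3)
        req = REQUEST_RE.match(rest)
        if req:
            requests[key] = (req.group(1), req.group(2))
            continue
        res = RESULT_RE.match(rest)
        if not res:
            continue
        err, csn = int(res.group(1)), res.group(2)
        if err == 0:
            succeeded.add(csn)
        else:
            failed[csn] = requests.get(key, ("?", "?")) + (err,)
    return {csn: info for csn, info in failed.items() if csn not in succeeded}
```

Fed with the access log of replica2 (for example via `open(path)`, where the path is whatever your instance uses), an empty result would mean every failed replicated operation was eventually applied.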

I don't see why there is a deadlock.

Is it reproducible every time?
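
When comparing the logs from the different nodes it can help to decode the CSNs. The sketch below assumes the usual 389-ds CSN layout of 20 hex digits: 8 for the Unix timestamp, then 4 each for sequence number, replica id and sub-sequence number. The names `CSN` and `parse_csn` are illustrative only.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class CSN(NamedTuple):
    timestamp: datetime  # time the change originated, in UTC
    seqnum: int          # sequence number within that second
    replica_id: int      # id of the replica where the change originated
    subseqnum: int       # sub-sequence number


def parse_csn(csn: str) -> CSN:
    """Split a 20-hex-digit CSN string into its four fields (assumed layout)."""
    if len(csn) != 20:
        raise ValueError(f"expected 20 hex digits, got {len(csn)}")
    return CSN(
        timestamp=datetime.fromtimestamp(int(csn[0:8], 16), tz=timezone.utc),
        seqnum=int(csn[8:12], 16),
        replica_id=int(csn[12:16], 16),
        subseqnum=int(csn[16:20], 16),
    )
```

For example, `parse_csn("55802fcf000300040000")` gives the origin time and originating replica id of the failed DEL, which can then be matched against the error logs of the other nodes.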


On 06/16/2015 04:49 PM, Oleg Fayans wrote:
> Hi all,
>
> I've bumped into a strange problem: only part of the changes made on 
> the master during a replica outage get replicated after the replica 
> recovers.
>
> Namely: when I delete an existing user on the master while the node is 
> offline, these changes do not get to the node when it's back online. 
> User creation, however, gets replicated as expected.
>
> Steps to reproduce:
>
> 1. Create the following topology:
>
> replica1 <-> master <-> replica2 <-> replica3
>
> 2. Create user1 on master, make sure it appears on all replicas
> 3. Turn off replica2
> 4. On master delete user1 and create user2, make sure the changes get 
> replicated to replica1
> 5. Turn on replica2
>
> Expected results:
>
> A minute or so after replica2 is back up,
> 1. user1 exists neither on replica2 nor on replica3
> 2. user2 exists both on replica2 and replica3
>
> Actual results:
> 1. user1 coexists with user2 on replica2 and replica3
> 2. master and replica1 have only user2
>
>
> In my case, though, the topology was as follows:
> $ ipa topologysegment-find realm
> ------------------
> 3 segments matched
> ------------------
>   Segment name: f22master.bagam.net-to-f22replica3.bagam.net
>   Left node: f22master.bagam.net
>   Right node: f22replica3.bagam.net
>   Connectivity: both
>
>   Segment name: replica1-to-replica2
>   Left node: f22replica1.bagam.net
>   Right node: f22replica2.bagam.net
>   Connectivity: both
>
>   Segment name: replica2-to-master
>   Left node: f22replica2.bagam.net
>   Right node: f22master.bagam.net
>   Connectivity: both
> ----------------------------
> Number of entries returned 3
> ----------------------------
> And I was turning off replica2, leaving replica1 offline, but that 
> does not really matter.
>
> The dirsrv error message most likely to be relevant is:
> ----------------------------------------------------------------------------------------------------------------------------------------------------- 
>
> Consumer failed to replay change (uniqueid 
> b8242e18-143111e5-b1d0d0c3-ae5854ff, CSN 55802fcf000300040000): 
> Operations error (1). Will retry later
> ----------------------------------------------------------------------------------------------------------------------------------------------------- 
>
>
> I attach dirsrv error and access logs from all nodes, in case they 
> could be useful
>
