[Freeipa-devel] user deletion in offline mode does not get replicated after node recovery
thierry bordaz
tbordaz at redhat.com
Tue Jun 16 17:02:02 UTC 2015
Hello
On Master:
User 'onmaster' was deleted
[16/Jun/2015:10:16:45 -0400] conn=402 op=19 SRCH
base="cn=otp,dc=bagam,dc=net" scope=1
filter="(&(objectClass=ipatoken)(ipatokenOwner=uid=onmaster,cn=users,cn=accounts,dc=bagam,dc=net))"
attrs="ipatokenNotAfter description ipatokenOwner objectClass
ipatokenDisabled ipatokenVendor managedBy ipatokenModel
ipatokenNotBefore ipatokenUniqueID ipatokenSerial"
[16/Jun/2015:10:16:45 -0400] conn=402 op=19 RESULT err=0 tag=101
nentries=0 etime=0
[16/Jun/2015:10:16:45 -0400] conn=402 op=20 DEL
dn="uid=onmaster,cn=users,cn=accounts,dc=bagam,dc=net"
[16/Jun/2015:10:16:45 -0400] conn=402 op=21 UNBIND
[16/Jun/2015:10:16:45 -0400] conn=402 op=21 fd=120 closed - U1
[16/Jun/2015:10:16:45 -0400] conn=402 op=20 RESULT err=0 tag=107
nentries=0 etime=0 csn=55802fcf000300040000
The replication agreement failed to replicate it to replica2
[16/Jun/2015:10:18:36 -0400] NSMMReplicationPlugin -
agmt="cn=f22master.bagam.net-to-f22replica2.bagam.net"
(f22replica2:389): Consumer failed to replay change (uniqueid
b8242e18-143111e5-b1d0d0c3-ae5854ff, CSN 55802fcf000300040000):
Operations error (1). Will retry later.
On replica2:
The replicated operation failed
[16/Jun/2015:10:18:27 -0400] conn=8 op=4 RESULT err=0 tag=101 nentries=1
etime=0
[16/Jun/2015:10:18:27 -0400] conn=8 op=5 EXT
oid="2.16.840.1.113730.3.5.12" name="replication-multimaster-extop"
[16/Jun/2015:10:18:27 -0400] conn=8 op=5 RESULT err=0 tag=120 nentries=0
etime=0
[16/Jun/2015:10:18:27 -0400] conn=8 op=6 DEL
dn="uid=onmaster,cn=users,cn=accounts,dc=bagam,dc=net"
[16/Jun/2015:10:18:35 -0400] conn=8 op=6 RESULT err=1 tag=107 nentries=0
etime=8 csn=55802fcf000300040000
because of failures to update the DB.
The failures were E_AGAIN or E_DB_DEADLOCK. In such a situation, DS
retries after a small delay.
The problem is that it retried 50 times without success.
[16/Jun/2015:10:18:34 -0400] NSMMReplicationPlugin - changelog program -
_cl5WriteOperationTxn: retry (49) the transaction
(csn=55802fcf000300040000) failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK:
Locker killed to resolve a deadlock))
[16/Jun/2015:10:18:34 -0400] NSMMReplicationPlugin - changelog program -
_cl5WriteOperationTxn: failed to write entry with csn
(55802fcf000300040000); db error - -30993 BDB0068 DB_LOCK_DEADLOCK:
Locker killed to resolve a deadlock
[16/Jun/2015:10:18:34 -0400] NSMMReplicationPlugin -
write_changelog_and_ruv: can't add a change for
uid=onmaster,cn=users,cn=accounts,dc=bagam,dc=net (uniqid:
b8242e18-143111e5-b1d0d0c3-ae5854ff, optype: 32) to changelog csn
55802fcf000300040000
[16/Jun/2015:10:18:34 -0400] - SLAPI_PLUGIN_BE_TXN_POST_DELETE_FN plugin
returned error code but did not set SLAPI_RESULT_CODE
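To make the retry behaviour described above concrete, here is a minimal Python sketch of a bounded retry loop around a write that can hit a deadlock. It is not the 389 DS code (which is C): the names DeadlockError and write_with_retry, the delay values and the jitter are assumptions for illustration. Only the limit of 50 attempts comes from the log above.

```python
import random
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a DB_LOCK_DEADLOCK result from the DB layer."""


def write_with_retry(txn_fn, max_retries=50, base_delay=0.01, sleep=time.sleep):
    """Run txn_fn, retrying on DeadlockError up to max_retries attempts.

    Returns the number of attempts used on success. Once the retry budget
    is exhausted, the last DeadlockError is re-raised to the caller.
    """
    for attempt in range(1, max_retries + 1):
        try:
            txn_fn()
            return attempt
        except DeadlockError:
            if attempt == max_retries:
                # Budget exhausted: the caller must treat the update as failed.
                raise
            # Short randomized pause (assumed values) so that competing
            # writers do not retry in lockstep.
            sleep(base_delay * random.uniform(0.5, 1.5))
```

The point of the sketch is the last branch: when every attempt fails, the error has to reach the caller so that the update is not silently dropped.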
The MAIN issue here is that replica2 successfully applied other updates
after 55802fcf000300040000 from the same replica (e.g.
csn=55802fcf000400040000).
I do not know whether the master was able to detect this failure and
replay this update, but I am afraid it did not!
It looks like you hit https://fedorahosted.org/389/ticket/47788
Is it possible to access your VM?
[16/Jun/2015:10:18:27 -0400] conn=8 op=6 DEL
dn="uid=onmaster,cn=users,cn=accounts,dc=bagam,dc=net"
[16/Jun/2015:10:18:35 -0400] conn=8 op=6 RESULT err=1 tag=107 nentries=0
etime=8 csn=55802fcf000300040000
[16/Jun/2015:10:18:35 -0400] conn=8 op=7 MOD
dn="cn=ipausers,cn=groups,cn=accounts,dc=bagam,dc=net"
[16/Jun/2015:10:18:36 -0400] conn=8 op=7 RESULT err=0 tag=103 nentries=0
etime=1 csn=55802fcf000400040000
[16/Jun/2015:10:18:36 -0400] conn=8 op=8 DEL
dn="cn=onmaster,cn=groups,cn=accounts,dc=bagam,dc=net"
[16/Jun/2015:10:18:37 -0400] conn=8 op=8 RESULT err=0 tag=107 nentries=0
etime=1 csn=55802fcf000700040000
[16/Jun/2015:10:18:37 -0400] conn=8 op=9 MOD
dn="cn=ipausers,cn=groups,cn=accounts,dc=bagam,dc=net"
[16/Jun/2015:10:18:37 -0400] conn=8 op=9 RESULT err=0 tag=103 nentries=0
etime=0 csn=55802fd0000000060000
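The ordering problem in this excerpt can be checked mechanically. The following Python sketch scans access-log RESULT lines and reports every failed CSN that is followed by a successfully applied, later CSN from the same replica ID. It assumes the usual CSN layout of 8 hex digits of timestamp, 4 of sequence number, 4 of replica ID and 4 of sub-sequence number, and it assumes each RESULT record is on a single line, as in the real access log (the excerpts in this mail are wrapped). The function names are illustrative only.

```python
import re
from collections import namedtuple

# Assumed CSN layout: timestamp(8) seqnum(4) replica_id(4) subseqnum(4), all hex.
CSN = namedtuple("CSN", "timestamp seqnum replica_id subseqnum")

RESULT_RE = re.compile(
    r"conn=(\d+) op=(\d+) RESULT err=(\d+).*?csn=([0-9a-f]{20})"
)


def parse_csn(text):
    """Split a 20-hex-digit CSN string into its four numeric fields."""
    return CSN(
        timestamp=int(text[0:8], 16),
        seqnum=int(text[8:12], 16),
        replica_id=int(text[12:16], 16),
        subseqnum=int(text[16:20], 16),
    )


def find_skipped_updates(log_text):
    """Return (failed_csn, later_applied_csn) pairs sharing a replica ID."""
    failed = {}  # replica_id -> list of CSN strings whose RESULT had err != 0
    skipped = []
    for match in RESULT_RE.finditer(log_text):
        err, csn_text = int(match.group(3)), match.group(4)
        csn = parse_csn(csn_text)
        if err != 0:
            failed.setdefault(csn.replica_id, []).append(csn_text)
            continue
        # A successful update newer than a failed one from the same
        # replica ID means the failed update was skipped over.
        for bad in failed.get(csn.replica_id, []):
            if parse_csn(bad) < csn:
                skipped.append((bad, csn_text))
    return skipped
```

Run against the unwrapped form of the four RESULT lines above, this should flag 55802fcf000400040000 and 55802fcf000700040000 as applied after the failed 55802fcf000300040000. The last update, 55802fd0000000060000, carries a different replica ID and is not reported.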
On 06/16/2015 04:49 PM, Oleg Fayans wrote:
> Hi all,
>
> I've bumped into a strange problem: only part of the changes
> made on the master during a replica outage get replicated after
> the replica recovers.
>
> Namely: when I delete an existing user on the master while the node is
> offline, these changes do not get to the node when it's back online.
> User creation, however, gets replicated as expected.
>
> Steps to reproduce:
>
> 1. Create the following topology:
>
> replica1 <-> master <-> replica2 <-> replica3
>
> 2. Create user1 on master, make sure it appears on all replicas
> 3. Turn off replica2
> 4. On master delete user1 and create user2, make sure the changes get
> replicated to replica1
> 5. Turn on replica2
>
> Expected results:
>
> A minute or so after replica2 is back up,
> 1. user1 exists neither on replica2 nor on replica3
> 2. user2 exists on both replica2 and replica3
>
> Actual results:
> 1. user1 coexists with user2 on replica2 and replica3
> 2. master and replica1 have only user2
>
>
> In my case, though, the topology was as follows:
> $ ipa topologysegment-find realm
> ------------------
> 3 segments matched
> ------------------
> Segment name: f22master.bagam.net-to-f22replica3.bagam.net
> Left node: f22master.bagam.net
> Right node: f22replica3.bagam.net
> Connectivity: both
>
> Segment name: replica1-to-replica2
> Left node: f22replica1.bagam.net
> Right node: f22replica2.bagam.net
> Connectivity: both
>
> Segment name: replica2-to-master
> Left node: f22replica2.bagam.net
> Right node: f22master.bagam.net
> Connectivity: both
> ----------------------------
> Number of entries returned 3
> ----------------------------
> And I was turning off replica2, leaving replica1 offline, but that
> does not really matter.
>
> The dirsrv error message most likely to be relevant is:
> -----------------------------------------------------------------------------------------------------------------------------------------------------
>
> Consumer failed to replay change (uniqueid
> b8242e18-143111e5-b1d0d0c3-ae5854ff, CSN 55802fcf000300040000):
> Operations error (1). Will retry later
> -----------------------------------------------------------------------------------------------------------------------------------------------------
>
>
> I attach dirsrv error and access logs from all nodes, in case they
> could be useful