<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Hi Ludwig,<br>
<br>
<div class="moz-cite-prefix">On 06/17/2015 11:06 AM, Ludwig Krispenz
wrote:<br>
</div>
<blockquote cite="mid:558138B1.1070809@redhat.com" type="cite">
Hi Oleg,<br>
<br>
Can you give a bit more info on the scenarios in which this happens?
Does it happen always, or is it a timing problem?<br>
</blockquote>
I guess it is a timing problem. It happened yesterday, but today I was
unable to reproduce it. The scenario is very simple:<br>
create user1, make sure it's there, turn off a replica, then create
another user on master and delete user1 on master, then turn the replica
back on.<br>
I still have the environment in which 2 replicas hold a user that
was deleted on master. All user (and other data) manipulations on
this very setup now work as intended.<br>
<blockquote cite="mid:558138B1.1070809@redhat.com" type="cite"> <br>
Ludwig<br>
<br>
<div class="moz-cite-prefix">On 06/16/2015 07:02 PM, thierry
bordaz wrote:<br>
</div>
<blockquote cite="mid:5580568A.2000404@redhat.com" type="cite">
<div class="moz-cite-prefix">Hello<br>
<br>
<br>
On Master:<br>
User 'onmaster' was deleted<br>
<br>
[16/Jun/2015:10:16:45 -0400] conn=402 op=19 SRCH
base="cn=otp,dc=bagam,dc=net" scope=1
filter="(&(objectClass=ipatoken)(ipatokenOwner=uid=onmaster,cn=users,cn=accounts,dc=bagam,dc=net))"
attrs="ipatokenNotAfter description ipatokenOwner objectClass
ipatokenDisabled ipatokenVendor managedBy ipatokenModel
ipatokenNotBefore ipatokenUniqueID ipatokenSerial"<br>
[16/Jun/2015:10:16:45 -0400] conn=402 op=19 RESULT err=0
tag=101 nentries=0 etime=0<br>
[16/Jun/2015:10:16:45 -0400] conn=402 op=20 DEL
dn="uid=onmaster,cn=users,cn=accounts,dc=bagam,dc=net"<br>
[16/Jun/2015:10:16:45 -0400] conn=402 op=21 UNBIND<br>
[16/Jun/2015:10:16:45 -0400] conn=402 op=21 fd=120 closed - U1<br>
[16/Jun/2015:10:16:45 -0400] conn=402 op=20 RESULT err=0
tag=107 nentries=0 etime=0 csn=55802fcf000300040000<br>
<br>
The replication agreement failed to replicate it to
replica2:<br>
[16/Jun/2015:10:18:36 -0400] NSMMReplicationPlugin -
agmt="cn=f22master.bagam.net-to-f22replica2.bagam.net"
(f22replica2:389): Consumer failed to replay change (uniqueid
b8242e18-143111e5-b1d0d0c3-ae5854ff, CSN
55802fcf000300040000): Operations error (1). Will retry later.<br>
<br>
<br>
On replica2:<br>
<br>
The replicated operation failed<br>
[16/Jun/2015:10:18:27 -0400] conn=8 op=4 RESULT err=0 tag=101
nentries=1 etime=0<br>
[16/Jun/2015:10:18:27 -0400] conn=8 op=5 EXT
oid="2.16.840.1.113730.3.5.12"
name="replication-multimaster-extop"<br>
[16/Jun/2015:10:18:27 -0400] conn=8 op=5 RESULT err=0 tag=120
nentries=0 etime=0<br>
[16/Jun/2015:10:18:27 -0400] conn=8 op=6 DEL
dn="uid=onmaster,cn=users,cn=accounts,dc=bagam,dc=net"<br>
[16/Jun/2015:10:18:35 -0400] conn=8 op=6 RESULT err=1 tag=107
nentries=0 etime=8 csn=55802fcf000300040000<br>
<br>
because of failed DB updates.<br>
The failures were E_AGAIN or E_DB_DEADLOCK. In such a
situation, DS retries after a small delay.<br>
The problem is that it retried 50 times without success.<br>
[16/Jun/2015:10:18:34 -0400] NSMMReplicationPlugin - changelog
program - _cl5WriteOperationTxn: retry (49) the transaction
(csn=55802fcf000300040000) failed (rc=-30993 (BDB0068
DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))<br>
[16/Jun/2015:10:18:34 -0400] NSMMReplicationPlugin - changelog
program - _cl5WriteOperationTxn: failed to write entry with
csn (55802fcf000300040000); db error - -30993 BDB0068
DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock<br>
[16/Jun/2015:10:18:34 -0400] NSMMReplicationPlugin -
write_changelog_and_ruv: can't add a change for
uid=onmaster,cn=users,cn=accounts,dc=bagam,dc=net (uniqid:
b8242e18-143111e5-b1d0d0c3-ae5854ff, optype: 32) to changelog
csn 55802fcf000300040000<br>
[16/Jun/2015:10:18:34 -0400] -
SLAPI_PLUGIN_BE_TXN_POST_DELETE_FN plugin returned error code
but did not set SLAPI_RESULT_CODE<br>
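<br>
To illustrate the retry behaviour described above, here is a minimal
Python sketch of a bounded retry loop. It is not the actual DS code:
the names DeadlockError, write_with_retry and make_flaky_txn, as well
as the delay value, are hypothetical. Only the limit of 50 attempts is
taken from the log above.<br>
<pre>
```python
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a DB_LOCK_DEADLOCK result from the database."""


def write_with_retry(write_txn, max_retries=50, delay=0.0):
    """Call write_txn until it succeeds or max_retries attempts have failed.

    Returns the number of attempts used. Re-raises the last DeadlockError
    when every attempt fails, which is the situation seen in the log.
    """
    last_error = None
    for attempt in range(max_retries):
        try:
            write_txn()
            return attempt + 1
        except DeadlockError as exc:
            last_error = exc
            time.sleep(delay)  # small pause before the next attempt
    raise last_error


def make_flaky_txn(failures):
    """Build a fake transaction that deadlocks for the first `failures` calls."""
    remaining = [failures]

    def txn():
        if remaining[0] != 0:
            remaining[0] -= 1
            raise DeadlockError("locker killed to resolve a deadlock")

    return txn
```
</pre>
With this sketch, write_with_retry(make_flaky_txn(3)) succeeds on the
fourth attempt, while make_flaky_txn(50) exhausts all 50 attempts and
the error is passed on to the caller, so the whole operation fails.<br>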
<br>
<br>
The MAIN issue here is that replica2 successfully applied
other updates after 55802fcf000300040000 from the same
replica (e.g. csn=55802fcf000400040000).<br>
I do not know if master was able to detect this failure and
replay this update, but I am afraid it did not!<br>
It looks like you hit <a moz-do-not-send="true"
class="moz-txt-link-freetext"
href="https://fedorahosted.org/389/ticket/47788">https://fedorahosted.org/389/ticket/47788</a><br>
Is it possible to access your VM?<br>
<br>
[16/Jun/2015:10:18:27 -0400] conn=8 op=6 DEL
dn="uid=onmaster,cn=users,cn=accounts,dc=bagam,dc=net"<br>
[16/Jun/2015:10:18:35 -0400] conn=8 op=6 RESULT err=1 tag=107
nentries=0 etime=8 csn=55802fcf000300040000<br>
[16/Jun/2015:10:18:35 -0400] conn=8 op=7 MOD
dn="cn=ipausers,cn=groups,cn=accounts,dc=bagam,dc=net"<br>
[16/Jun/2015:10:18:36 -0400] conn=8 op=7 RESULT err=0 tag=103
nentries=0 etime=1 csn=55802fcf000400040000<br>
[16/Jun/2015:10:18:36 -0400] conn=8 op=8 DEL
dn="cn=onmaster,cn=groups,cn=accounts,dc=bagam,dc=net"<br>
[16/Jun/2015:10:18:37 -0400] conn=8 op=8 RESULT err=0 tag=107
nentries=0 etime=1 csn=55802fcf000700040000<br>
[16/Jun/2015:10:18:37 -0400] conn=8 op=9 MOD
dn="cn=ipausers,cn=groups,cn=accounts,dc=bagam,dc=net"<br>
[16/Jun/2015:10:18:37 -0400] conn=8 op=9 RESULT err=0 tag=103
nentries=0 etime=0 csn=55802fd0000000060000<br>
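<br>
To spot this pattern in other access logs, a minimal Python sketch
could look like the one below. It assumes that each log entry is on a
single line (unlike the wrapped excerpt above) and that a CSN is 20 hex
digits laid out as timestamp (8), sequence number (4), replica ID (4)
and sub-sequence number (4). The names parse_csn and find_skipped_csns
are hypothetical and not part of any DS tool.<br>
<pre>
```python
import re

# Matches RESULT lines of an access log that carry a CSN.
RESULT_RE = re.compile(r"RESULT err=(\d+) .*csn=([0-9a-f]{20})")


def parse_csn(csn):
    """Split a 20-hex-digit CSN into (timestamp, seqnum, replica_id, subseqnum)."""
    return (int(csn[0:8], 16), int(csn[8:12], 16),
            int(csn[12:16], 16), int(csn[16:20], 16))


def find_skipped_csns(lines):
    """Return failed CSNs that were followed by a later success from the same replica ID."""
    failed = []   # CSNs whose RESULT had a non-zero err
    skipped = []  # failed CSNs that a newer CSN from the same replica overtook
    for line in lines:
        match = RESULT_RE.search(line)
        if match is None:
            continue
        err, csn = int(match.group(1)), match.group(2)
        if err != 0:
            failed.append(csn)
            continue
        rid = parse_csn(csn)[2]
        for old in failed:
            same_replica = parse_csn(old)[2] == rid
            # Fixed-width lowercase hex strings sort in CSN order.
            newer = old != csn and max(old, csn) == csn
            if same_replica and newer and old not in skipped:
                skipped.append(old)
    return skipped
```
</pre>
Fed with the four RESULT lines above, find_skipped_csns should report
only 55802fcf000300040000, the failed DEL that later updates from
replica ID 4 overtook.<br>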
<br>
<br>
<br>
<br>
On 06/16/2015 04:49 PM, Oleg Fayans wrote:<br>
</div>
<blockquote cite="mid:55803767.5080906@redhat.com" type="cite">Hi
all, <br>
<br>
I've bumped into a strange problem: only part of the changes
made on master during a replica outage gets replicated
after the replica recovers. <br>
<br>
Namely: when I delete an existing user on the master while the
node is offline, the deletion does not reach the node when
it's back online. User creation, however, gets replicated as
expected. <br>
<br>
Steps to reproduce: <br>
<br>
1. Create the following topology: <br>
<br>
replica1 <-> master <-> replica2 <->
replica3 <br>
<br>
2. Create user1 on master, make sure it appears on all
replicas <br>
3. Turn off replica2 <br>
4. On master delete user1 and create user2, make sure the
changes get replicated to replica1 <br>
5. Turn on replica2 <br>
<br>
Expected results: <br>
<br>
A minute or so after replica2 is back up, <br>
1. user1 exists neither on replica2 nor on replica3 <br>
2. user2 exists on both replica2 and replica3 <br>
<br>
Actual results: <br>
1. user1 coexists with user2 on replica2 and replica3 <br>
2. master and replica1 have only user2 <br>
<br>
<br>
In my case, though, the topology was as follows: <br>
$ ipa topologysegment-find realm <br>
------------------ <br>
3 segments matched <br>
------------------ <br>
Segment name: f22master.bagam.net-to-f22replica3.bagam.net <br>
Left node: f22master.bagam.net <br>
Right node: f22replica3.bagam.net <br>
Connectivity: both <br>
<br>
Segment name: replica1-to-replica2 <br>
Left node: f22replica1.bagam.net <br>
Right node: f22replica2.bagam.net <br>
Connectivity: both <br>
<br>
Segment name: replica2-to-master <br>
Left node: f22replica2.bagam.net <br>
Right node: f22master.bagam.net <br>
Connectivity: both <br>
---------------------------- <br>
Number of entries returned 3 <br>
---------------------------- <br>
And I was turning off replica2, leaving replica1 offline, but
that does not really matter. <br>
<br>
The dirsrv error message most likely to be relevant is: <br>
-----------------------------------------------------------------------------------------------------------------------------------------------------
<br>
Consumer failed to replay change (uniqueid
b8242e18-143111e5-b1d0d0c3-ae5854ff, CSN
55802fcf000300040000): Operations error (1). Will retry later
<br>
-----------------------------------------------------------------------------------------------------------------------------------------------------
<br>
<br>
I am attaching the dirsrv error and access logs from all nodes, in case
they are useful. <br>
<br>
<br>
<br>
<br>
<br>
</blockquote>
<br>
<br>
<br>
</blockquote>
<br>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Oleg Fayans
Quality Engineer
FreeIPA team
RedHat.</pre>
</body>
</html>