<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 04/03/2014 03:46 PM, Nevada Sanchez
wrote:<br>
</div>
<blockquote
cite="mid:CAPUVn2s1kY17pUN7W4eAA4NEOfjisi2ycntq48+7qU7WpnT+eQ@mail.gmail.com"
type="cite">
<div dir="ltr">Okay, I updated the gist and extended some of the
logs (ipa2-errors does stop at 20:50:21). I'll follow up when I
have the debug stuff in place.
<div><br>
</div>
<div><a moz-do-not-send="true"
href="https://gist.github.com/nevsan/8b6f78d7396963dc5f70">https://gist.github.com/nevsan/8b6f78d7396963dc5f70</a><br>
</div>
</div>
</blockquote>
<br>
Another strange thing - it looks as if the initial replica init
completes successfully.<br>
<br>
[02/Apr/2014:20:50:18 +0000] NSMMReplicationPlugin - Beginning total
update of replica "agmt="cn=meToipa2.example.com" (ipa2:389)".<br>
<br>
On the replica:<br>
<br>
[02/Apr/2014:20:50:18 +0000] NSMMReplicationPlugin -
multimaster_be_state_change: replica dc=example,dc=com is going
offline; disabling replication<br>
[02/Apr/2014:20:50:18 +0000] - WARNING: Import is running with
nsslapd-db-private-import-mem on; No other process is allowed to
access the database<br>
[02/Apr/2014:20:50:21 +0000] - import userRoot: Workers finished;
cleaning up...<br>
[02/Apr/2014:20:50:21 +0000] - import userRoot: Workers cleaned up.<br>
[02/Apr/2014:20:50:21 +0000] - import userRoot: Indexing complete.
Post-processing...<br>
[02/Apr/2014:20:50:21 +0000] - import userRoot: Generating
numSubordinates complete.<br>
[02/Apr/2014:20:50:21 +0000] - import userRoot: Flushing caches...<br>
[02/Apr/2014:20:50:21 +0000] - import userRoot: Closing files...<br>
[02/Apr/2014:20:50:21 +0000] - import userRoot: Import complete.
Processed 453 entries in 3 seconds. (151.00 entries/sec)<br>
[02/Apr/2014:20:50:21 +0000] NSMMReplicationPlugin -
multimaster_be_state_change: replica dc=example,dc=com is coming
online; enabling replication<br>
<br>
On the master, access log:<br>
<br>
[02/Apr/2014:20:50:17 +0000] conn=1365 op=15 MOD
dn="cn=meToipa2.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
tree,cn=config"<br>
<br>
This is the operation that triggers the replica init. Then
ipa-replica-install polls for agreement status:<br>
[02/Apr/2014:20:50:19 +0000] conn=1365 op=16 SRCH
base="cn=meToipa2.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
tree,cn=config" scope=0 filter="(objectClass=*)"
attrs="nsds5replicaLastInitStart nsds5replicaUpdateInProgress
nsds5replicaLastInitStatus cn nsds5BeginReplicaRefresh
nsds5replicaLastInitEnd"<br>
[02/Apr/2014:20:50:19 +0000] conn=1365 op=16 RESULT err=0 tag=101
nentries=1 etime=0<br>
[02/Apr/2014:20:50:20 +0000] conn=1365 op=17 SRCH
base="cn=meToipa2.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
tree,cn=config" scope=0 filter="(objectClass=*)"
attrs="nsds5replicaLastInitStart nsds5replicaUpdateInProgress
nsds5replicaLastInitStatus cn nsds5BeginReplicaRefresh
nsds5replicaLastInitEnd"<br>
[02/Apr/2014:20:50:20 +0000] conn=1365 op=17 RESULT err=0 tag=101
nentries=1 etime=0<br>
[02/Apr/2014:20:50:21 +0000] conn=1365 op=18 SRCH
base="cn=meToipa2.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
tree,cn=config" scope=0 filter="(objectClass=*)"
attrs="nsds5replicaLastInitStart nsds5replicaUpdateInProgress
nsds5replicaLastInitStatus cn nsds5BeginReplicaRefresh
nsds5replicaLastInitEnd"<br>
[02/Apr/2014:20:50:21 +0000] conn=1365 op=18 RESULT err=0 tag=101
nentries=1 etime=0<br>
[02/Apr/2014:20:50:22 +0000] conn=1365 op=19 SRCH
base="cn=meToipa2.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
tree,cn=config" scope=0 filter="(objectClass=*)"
attrs="nsds5replicaLastInitStart nsds5replicaUpdateInProgress
nsds5replicaLastInitStatus cn nsds5BeginReplicaRefresh
nsds5replicaLastInitEnd"<br>
[02/Apr/2014:20:50:22 +0000] conn=1365 op=19 RESULT err=0 tag=101
nentries=1 etime=1<br>
<br>
Something goes wrong here. According to the replica error log, the
replica init is done. We don't have the replica access log from around
this time to see exactly when the connection was closed, but judging
from the ipa code, it appears that ipa never saw a status of
"Total update succeeded". I'm not sure why the master would not have
reported that, unless there was some problem getting the status back
from the replica.<br>
<br>
[02/Apr/2014:20:50:22 +0000] conn=1365 op=20 UNBIND<br>
[02/Apr/2014:20:50:22 +0000] conn=1365 op=20 fd=114 closed - U1<br>
<br>
Then ipa-replica-install closes the connection and reports the
error.<br>
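<br>
For reference, the kind of check that the polling loop above appears to
perform can be sketched as follows. This is only an illustration and not
the actual ipa code: the function name classify_init_status and the
dict-shaped entry are hypothetical, and the rule that a set
nsds5BeginReplicaRefresh means the init is still pending is an
assumption. The only facts taken from the logs are the attribute names
in the SRCH requests and the "Total update succeeded" status string.<br>
<pre>
```python
def classify_init_status(entry):
    """Classify one poll result of the replication agreement entry.

    `entry` is a hypothetical dict mapping attribute names (as requested
    in the SRCH operations) to string values; attributes may be missing.
    """
    # LDAP attribute names are case-insensitive, so normalize them first.
    attrs = {name.lower(): value for name, value in entry.items()}

    refresh = attrs.get("nsds5beginreplicarefresh")
    in_progress = attrs.get("nsds5replicaupdateinprogress", "").upper() == "TRUE"
    status = attrs.get("nsds5replicalastinitstatus", "")

    # Assumption: a set refresh attribute or an update in progress means
    # the total update has not finished yet.
    if refresh or in_progress:
        return "running"
    if not status:
        return "unknown"
    if "Total update succeeded" in status:
        return "succeeded"
    return "failed"
```
</pre>
With a classifier like this, a poll that never returns "succeeded"
before the client gives up would match what the access log shows.<br>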
<br>
<blockquote
cite="mid:CAPUVn2s1kY17pUN7W4eAA4NEOfjisi2ycntq48+7qU7WpnT+eQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Thu, Apr 3, 2014 at 10:38 AM, Rich
Megginson <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:rmeggins@redhat.com" target="_blank">rmeggins@redhat.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div class="">
<div>On 04/02/2014 09:22 PM, Nevada Sanchez wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Okay. Updated the gist with the
additional logs: <a moz-do-not-send="true"
href="https://gist.github.com/nevsan/8b6f78d7396963dc5f70"
target="_blank">https://gist.github.com/nevsan/8b6f78d7396963dc5f70</a></div>
<div class="gmail_extra"><br>
<br>
</div>
</blockquote>
<br>
</div>
1) Dirsrv is crashing:<br>
[02/Apr/2014:20:49:53 +0000] - 389-Directory/1.3.1.22.a1
B2014.073.1751 starting up<br>
[02/Apr/2014:20:49:54 +0000] - Db home directory is not
set. Possibly nsslapd-directory (optionally
nsslapd-db-home-directory) is missing in the config file.<br>
[02/Apr/2014:20:49:54 +0000] - I'm resizing my cache
now...cache was 710029312 and is now 8000000<br>
[02/Apr/2014:20:49:54 +0000] - 389-Directory/1.3.1.22.a1
B2014.073.1751 starting up<br>
[02/Apr/2014:20:49:54 +0000] - Detected Disorderly
Shutdown last time Directory Server was running,
recovering database.<br>
[02/Apr/2014:20:49:55 +0000] - slapd started. Listening on
All Interfaces port 389 for LDAP requests<br>
<br>
Please use the instructions at <a moz-do-not-send="true"
href="http://port389.org/wiki/FAQ#Debugging_Crashes"
target="_blank">http://port389.org/wiki/FAQ#Debugging_Crashes</a>
to get a core dump and stack trace.<br>
<br>
2) The first occurrence of the connection error is at
[02/Apr/2014:20:52:38 +0000], but there is nothing in
the consumer error log after [02/Apr/2014:20:50:21 +0000]
or in the consumer access log after [02/Apr/2014:20:50:22
+0000].
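<br>
<br>
To put a number on that gap, the timestamps can be parsed directly. A
minimal sketch, assuming the bracketed timestamp format seen in the
logs above and an English locale for the month name; the helper name
parse_ds_timestamp is made up for illustration.<br>
<pre>
```python
from datetime import datetime


def parse_ds_timestamp(stamp):
    """Parse a log timestamp such as '[02/Apr/2014:20:50:21 +0000]'."""
    return datetime.strptime(stamp.strip("[]"), "%d/%b/%Y:%H:%M:%S %z")


# Last line in the consumer error log and first connection error.
last_consumer_error = parse_ds_timestamp("[02/Apr/2014:20:50:21 +0000]")
first_connection_error = parse_ds_timestamp("[02/Apr/2014:20:52:38 +0000]")

gap = first_connection_error - last_consumer_error
print(gap.total_seconds())  # 137.0
```
</pre>
That is a little over two minutes during which the consumer logged
nothing at all.<br>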
<div>
<div class="h5"><br>
<br>
<blockquote type="cite">
<div class="gmail_extra">
<div class="gmail_quote"> On Wed, Apr 2, 2014 at
9:38 PM, Rich Megginson <span dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:rmeggins@redhat.com"
target="_blank">rmeggins@redhat.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0
0 0 .8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div>
<div>On 04/02/2014 03:01 PM, Nevada
Sanchez wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Okay, I ran it with debug
on. The output is quite large. I'm not
sure what the etiquette is for posting
large logs, so I threw it on gist
here: <a moz-do-not-send="true"
href="http://gist.githubusercontent.com/nevsan/8b6f78d7396963dc5f70/raw/b76b3c3acce4f12d292d680f4c1dab39c05888d5/gistfile1.txt"
target="_blank">https://gist.githubusercontent.com/nevsan/8b6f78d7396963dc5f70/raw/b76b3c3acce4f12d292d680f4c1dab39c05888d5/gistfile1.txt</a>
<div> <br>
</div>
<div>Let me know if I should copy it
into the thread instead.</div>
</div>
</blockquote>
<br>
</div>
Ok. Now can you post excerpts from the
dirsrv errors log from both the master
replica and the replica from around the time
of the failure?
<div>
<div><br>
<br>
<blockquote type="cite">
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Wed, Apr
2, 2014 at 1:49 PM, Rich Megginson
<span dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:rmeggins@redhat.com"
target="_blank">rmeggins@redhat.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div text="#000000"
bgcolor="#FFFFFF">
<div>
<div>On 04/02/2014 11:45 AM,
Nevada Sanchez wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">My
apologies. I mistakenly
ran the failing
ldapsearch as an
unprivileged user
(couldn't read the
slapd-EXAMPLE-COM
directory). Running as
root, it now works just
fine (same result as the
one that worked). SSL
seems not to be the
issue. Also, I haven't
changed the SSL certs
since I first set up the
master.<br>
<div><br>
</div>
<div>I have been doing
the replica side
things from scratch
(even so far as
starting with a new
machine). For the
master side, I have
just been re-preparing
the replica. I hope I
don't have to start
from scratch with the
master replica.</div>
</div>
</blockquote>
<br>
</div>
I guess the next step would be
to do the ipa-replica-install
using -ddd and review the
extra debug information that
comes out.
<div>
<div><br>
<br>
<blockquote type="cite">
<div dir="ltr"> </div>
<div class="gmail_extra"><br>
<br>
<div
class="gmail_quote">On
Wed, Apr 2, 2014 at
11:45 AM, Rob
Crittenden <span
dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:rcritten@redhat.com" target="_blank">rcritten@redhat.com</a>&gt;</span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0 0
0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">Rich
Megginson wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div> On
04/02/2014
09:20 AM,
Nevada Sanchez
wrote:<br>
</div>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div> Okay, we
might be on to
something:<br>
<br>
ipa -> ipa2<br>
================================<br>
$
LDAPTLS_CACERTDIR=/etc/dirsrv/slapd-EXAMPLE-COM
ldapsearch
-xLLLZZ<br>
</div>
-h <a
moz-do-not-send="true"
href="http://ipa2.example.com" target="_blank">ipa2.example.com</a>
-s base -b ""
<div><br>
'objectclass=*'
vendorVersion<br>
dn:<br>
vendorVersion:
389-Directory/1.3.1.22.a1
B2014.073.1751<br>
================================<br>
<br>
ipa2 -> ipa<br>
================================<br>
$
LDAPTLS_CACERTDIR=/etc/dirsrv/slapd-EXAMPLE-COM
ldapsearch
-xLLLZZ<br>
</div>
-h <a
moz-do-not-send="true"
href="http://ipa.example.com" target="_blank">ipa.example.com</a>
-s base -b ""
<div>
<div><br>
'objectclass=*'
vendorVersion<br>
ldap_start_tls:
Connect error
(-11)<br>
additional
info: TLS
error
-8172:Peer's
certificate
issuer has
been<br>
marked as not
trusted by the
user.<br>
================================<br>
<br>
The original
IPA trusts the
replica (since
it signed the
cert, I<br>
assume), but
the replica
doesn't trust
the main IPA
server. I
guess<br>
the ZZ option
would have
shown me the
failure that I
missed in my<br>
initial
ldapsearch
tests.<br>
</div>
</div>
</blockquote>
<div>
<div>
-Z[Z] Issue
StartTLS
(Transport
Layer
Security)
extended
operation. If
you use
-ZZ, the
command will
require the
operation to
be successful.<br>
<br>
i.e. use TLS
(via StartTLS),
and require a
successful
handshake<br>
<br>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<br>
Anyway, what's
the best way
to remedy this
in a way that
makes IPA<br>
happy? (I've
found that
LDAP can have
different
requirements
on which<br>
certs go
where).<br>
</blockquote>
<br>
I'm not sure.
ipa-server-install/ipa-replica-prepare/ipa-replica-install<br>
is supposed to
take care of
installing the
CA cert
properly for
you. If<br>
you try to
hack it and
install the CA
cert manually,
you will
probably<br>
miss something
else that ipa
install did
not do.<br>
<br>
I think the
only way to
ensure that
you have a
properly
configured ipa<br>
server +
replicas is to
get all of the
ipa commands
completing
successfully.<br>
<br>
Which means
going back to
the drawing
board and
starting over
from scratch.<br>
</div>
</div>
</blockquote>
<br>
You can compare
the certs that
each side is using
with:<br>
<br>
# certutil -L -d
/etc/dirsrv/slapd-EXAMPLE-COM<br>
<br>
Did you by chance
replace the SSL
server certs that
IPA uses on your
working master?<span><font
color="#888888"><br>
<br>
rob<br>
</font></span></blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</body>
</html>