<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 04/05/2014 09:35 AM, Nevada Sanchez
wrote:<br>
</div>
<blockquote
cite="mid:CAPUVn2tz4Y-0SpBscCgzWsevPihOahLBKWjHiEG=e9thfjAdeg@mail.gmail.com"
type="cite">
<div dir="ltr">Thanks. I added /var/log/messages to the gist (<a
moz-do-not-send="true"
href="https://gist.github.com/nevsan/8b6f78d7396963dc5f70">https://gist.github.com/nevsan/8b6f78d7396963dc5f70</a>).
No segfaults, it seems. Are there any other kinds of disorderly
shutdown that might happen? I'll look into creating a ticket for this.</div>
</blockquote>
<br>
Only if there is some sort of build issue that causes an undefined
symbol reference at runtime - that would cause the process to exit
without a core.<br>
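<br>
One way to look for that kind of problem is to ask the dynamic loader
to resolve every symbol up front. This is only a minimal sketch: it
assumes that ldd -r prints its usual "undefined symbol: NAME" lines,
and the helper names are made up for illustration.<br>
<pre>
```python
import re
import subprocess

# Matches the "undefined symbol: NAME" lines that `ldd -r` prints.
UNDEFINED_RE = re.compile(r"undefined symbol:\s*(\S+)")


def find_undefined_symbols(ldd_output):
    """Return the symbol names reported as undefined in `ldd -r` output."""
    return UNDEFINED_RE.findall(ldd_output)


def check_binary(path):
    """Run `ldd -r` on path and return any undefined symbols it reports."""
    # ldd -r performs relocations for data objects and functions and
    # reports anything missing; the messages may land on stdout or stderr.
    proc = subprocess.run(["ldd", "-r", path], capture_output=True, text=True)
    return find_undefined_symbols(proc.stdout + proc.stderr)
```
</pre>
Calling check_binary() on the ns-slapd binary (often /usr/sbin/ns-slapd,
but that path is an assumption) should return an empty list on a healthy
build. An empty list does not rule out a problem in a plugin that is
loaded later.<br>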
<br>
<blockquote
cite="mid:CAPUVn2tz4Y-0SpBscCgzWsevPihOahLBKWjHiEG=e9thfjAdeg@mail.gmail.com"
type="cite">
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Fri, Apr 4, 2014 at 9:16 PM, Rich
Megginson <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:rmeggins@redhat.com" target="_blank">rmeggins@redhat.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div class="">
<div>On 04/03/2014 10:25 PM, Nevada Sanchez wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">I followed the instructions that should
give me a core dump, but for some reason I don't
see one in /var/log/dirsrv/slapd-EXAMPLE-COM/, even
though the "Disorderly Shutdown" message still
shows up in the logs.</div>
</blockquote>
<br>
</div>
Hmm - check again - it should produce a core file<br>
<br>
grep -i segfault /var/log/messages
<div class=""><br>
<br>
<blockquote type="cite">
<div dir="ltr">I know that when I explicitly request
those attributes, I get "<span
style="font-family:arial,sans-serif;font-size:13px">-1
Total update abortedLDAP error: Can't contact L</span><span
style="font-family:arial,sans-serif;font-size:13px">DAP server" for
nsds5ReplicaLastInitStatus (see below). Access logs
stop completely on the replica after the time that
you mentioned.</span></div>
</blockquote>
<br>
</div>
Hmm - looks like a bug. Please open a ticket.
<div>
<div class="h5"><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div> <br>
</div>
<div>======================================================</div>
<div>
<div
style="font-family:arial,sans-serif;font-size:13px">[root@ipa2
ipaserver]# ldapsearch <a
moz-do-not-send="true">ldaps://</a><a
moz-do-not-send="true"
href="http://ipa.example.com:636/"
target="_blank">ipa.example.com:636</a> -D
'cn=Directory Manager' -w ##### -b 'cn=<a
moz-do-not-send="true"
href="http://metoipa2.example.com/"
target="_blank">meToipa2.example.com</a>,cn=replica,cn=dc\=example\,dc\=com,cn=mapping
tree,cn=config' '(objectClass=*)' -s base
nsds5ReplicaLastInitStart
nsds5replicaUpdateInProgress
nsds5ReplicaLastInitStatus cn
nsds5BeginReplicaRefresh
nsds5ReplicaLastInitEnd</div>
<div
style="font-family:arial,sans-serif;font-size:13px">#
extended LDIF</div>
<div
style="font-family:arial,sans-serif;font-size:13px">#</div>
<div
style="font-family:arial,sans-serif;font-size:13px">#
LDAPv3</div>
<div
style="font-family:arial,sans-serif;font-size:13px">
# base &lt;cn=<a moz-do-not-send="true"
href="http://metoipa2.example.com/"
target="_blank">meToipa2.example.com</a>,cn=replica,cn=dc\=example\,dc\=com,cn=mapping
tree,cn=config&gt; with scope baseObject</div>
<div
style="font-family:arial,sans-serif;font-size:13px">
# filter: (objectclass=*)</div>
<div
style="font-family:arial,sans-serif;font-size:13px">#
requesting: <a moz-do-not-send="true">ldaps://</a><a
moz-do-not-send="true"
href="http://ipa.example.com:636/"
target="_blank">ipa.example.com:636</a> (objectClass=*)
nsds5ReplicaLastInitStart
nsds5replicaUpdateInProgress
nsds5ReplicaLastInitStatus cn
nsds5BeginReplicaRefresh
nsds5ReplicaLastInitEnd </div>
<div
style="font-family:arial,sans-serif;font-size:13px">#</div>
<div
style="font-family:arial,sans-serif;font-size:13px"><br>
</div>
<div
style="font-family:arial,sans-serif;font-size:13px"># <a
moz-do-not-send="true"
href="http://metoipa2.example.com/"
target="_blank">meToipa2.example.com</a>,
replica, dc\3Dexample\2Cdc\3Dcom,</div>
<div
style="font-family:arial,sans-serif;font-size:13px">
mapping tree, config</div>
<div
style="font-family:arial,sans-serif;font-size:13px">dn:
cn=<a moz-do-not-send="true"
href="http://metoipa2.example.com/"
target="_blank">meToipa2.example.com</a>,cn=replica,cn=dc\3Dexample\2Cd</div>
<div
style="font-family:arial,sans-serif;font-size:13px"> c\3Dcom,cn=mapping
tree,cn=config</div>
<div
style="font-family:arial,sans-serif;font-size:13px">nsds5ReplicaLastInitStart:
20140401092800Z</div>
<div
style="font-family:arial,sans-serif;font-size:13px">
nsds5replicaUpdateInProgress: FALSE</div>
<div
style="font-family:arial,sans-serif;font-size:13px">nsds5ReplicaLastInitStatus:
-1 Total update abortedLDAP error: Can't
contact L</div>
<div
style="font-family:arial,sans-serif;font-size:13px">
DAP server</div>
<div
style="font-family:arial,sans-serif;font-size:13px">cn: <a
moz-do-not-send="true"
href="http://metoipa2.example.com/"
target="_blank">meToipa2.example.com</a></div>
<div
style="font-family:arial,sans-serif;font-size:13px">
nsds5ReplicaLastInitEnd: 20140401092804Z</div>
<div
style="font-family:arial,sans-serif;font-size:13px"><br>
</div>
<div
style="font-family:arial,sans-serif;font-size:13px">#
search result</div>
<div
style="font-family:arial,sans-serif;font-size:13px">search:
2</div>
<div
style="font-family:arial,sans-serif;font-size:13px">
result: 0 Success</div>
<div
style="font-family:arial,sans-serif;font-size:13px"><br>
</div>
<div
style="font-family:arial,sans-serif;font-size:13px">#
numResponses: 2</div>
<div
style="font-family:arial,sans-serif;font-size:13px">
# numEntries: 1</div>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Thu, Apr 3, 2014 at
6:32 PM, Rich Megginson <span dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:rmeggins@redhat.com"
target="_blank">rmeggins@redhat.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0
0 0 .8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div>
<div>On 04/03/2014 03:46 PM, Nevada
Sanchez wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Okay, I updated the gist
and extended some of the logs
(ipa2-errors does stop at 20:50:21).
I'll follow up when I have the debug
stuff in place.
<div><br>
</div>
<div><a moz-do-not-send="true"
href="https://gist.github.com/nevsan/8b6f78d7396963dc5f70"
target="_blank">https://gist.github.com/nevsan/8b6f78d7396963dc5f70</a><br>
</div>
</div>
</blockquote>
<br>
</div>
Another strange thing - it looks as if the
initial replica init completes successfully.<br>
<br>
[02/Apr/2014:20:50:18 +0000]
NSMMReplicationPlugin - Beginning total
update of replica "agmt="cn=<a
moz-do-not-send="true"
href="http://meToipa2.example.com"
target="_blank">meToipa2.example.com</a>"
(ipa2:389)".<br>
<br>
On the replica:<br>
<br>
[02/Apr/2014:20:50:18 +0000]
NSMMReplicationPlugin -
multimaster_be_state_change: replica
dc=example,dc=com is going offline;
disabling replication<br>
[02/Apr/2014:20:50:18 +0000] - WARNING:
Import is running with
nsslapd-db-private-import-mem on; No other
process is allowed to access the database<br>
[02/Apr/2014:20:50:21 +0000] - import
userRoot: Workers finished; cleaning up...<br>
[02/Apr/2014:20:50:21 +0000] - import
userRoot: Workers cleaned up.<br>
[02/Apr/2014:20:50:21 +0000] - import
userRoot: Indexing complete.
Post-processing...<br>
[02/Apr/2014:20:50:21 +0000] - import
userRoot: Generating numSubordinates
complete.<br>
[02/Apr/2014:20:50:21 +0000] - import
userRoot: Flushing caches...<br>
[02/Apr/2014:20:50:21 +0000] - import
userRoot: Closing files...<br>
[02/Apr/2014:20:50:21 +0000] - import
userRoot: Import complete. Processed 453
entries in 3 seconds. (151.00 entries/sec)<br>
[02/Apr/2014:20:50:21 +0000]
NSMMReplicationPlugin -
multimaster_be_state_change: replica
dc=example,dc=com is coming online; enabling
replication<br>
<br>
On the master, access log:<br>
<br>
[02/Apr/2014:20:50:17 +0000] conn=1365 op=15
MOD dn="cn=<a moz-do-not-send="true"
href="http://meToipa2.example.com"
target="_blank">meToipa2.example.com</a>,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
tree,cn=config"<br>
<br>
This is the operation that triggers the
replica init. Then ipa-replica-install
polls for agreement status:<br>
[02/Apr/2014:20:50:19 +0000] conn=1365 op=16
SRCH base="cn=<a moz-do-not-send="true"
href="http://meToipa2.example.com"
target="_blank">meToipa2.example.com</a>,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
tree,cn=config" scope=0
filter="(objectClass=*)"
attrs="nsds5replicaLastInitStart
nsds5replicaUpdateInProgress
nsds5replicaLastInitStatus cn
nsds5BeginReplicaRefresh
nsds5replicaLastInitEnd"<br>
[02/Apr/2014:20:50:19 +0000] conn=1365 op=16
RESULT err=0 tag=101 nentries=1 etime=0<br>
[02/Apr/2014:20:50:20 +0000] conn=1365 op=17
SRCH base="cn=<a moz-do-not-send="true"
href="http://meToipa2.example.com"
target="_blank">meToipa2.example.com</a>,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
tree,cn=config" scope=0
filter="(objectClass=*)"
attrs="nsds5replicaLastInitStart
nsds5replicaUpdateInProgress
nsds5replicaLastInitStatus cn
nsds5BeginReplicaRefresh
nsds5replicaLastInitEnd"<br>
[02/Apr/2014:20:50:20 +0000] conn=1365 op=17
RESULT err=0 tag=101 nentries=1 etime=0<br>
[02/Apr/2014:20:50:21 +0000] conn=1365 op=18
SRCH base="cn=<a moz-do-not-send="true"
href="http://meToipa2.example.com"
target="_blank">meToipa2.example.com</a>,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
tree,cn=config" scope=0
filter="(objectClass=*)"
attrs="nsds5replicaLastInitStart
nsds5replicaUpdateInProgress
nsds5replicaLastInitStatus cn
nsds5BeginReplicaRefresh
nsds5replicaLastInitEnd"<br>
[02/Apr/2014:20:50:21 +0000] conn=1365 op=18
RESULT err=0 tag=101 nentries=1 etime=0<br>
[02/Apr/2014:20:50:22 +0000] conn=1365 op=19
SRCH base="cn=<a moz-do-not-send="true"
href="http://meToipa2.example.com"
target="_blank">meToipa2.example.com</a>,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping
tree,cn=config" scope=0
filter="(objectClass=*)"
attrs="nsds5replicaLastInitStart
nsds5replicaUpdateInProgress
nsds5replicaLastInitStatus cn
nsds5BeginReplicaRefresh
nsds5replicaLastInitEnd"<br>
[02/Apr/2014:20:50:22 +0000] conn=1365 op=19
RESULT err=0 tag=101 nentries=1 etime=1<br>
<br>
Something happens here. The replica init is
done, according to the replica error log.
We don't have the replica access log from
around this time to see exactly when the
connection was closed, but looking at the
ipa code, it would appear that ipa did not
see a status of "Total update succeeded".
Not sure why the master would not have
reported that, unless there was some problem
getting back the status from the replica.<br>
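<br>
To make the polling logic concrete, here is a rough sketch of how a
client could classify the two attributes it reads on each poll. This is
not the actual ipa code: the function name and the returned labels are
made up for illustration, and the only status text taken from this
thread is "Total update succeeded".<br>
<pre>
```python
def classify_init_status(in_progress, last_status):
    """Classify one poll of a total update (replica init).

    in_progress: value of nsds5replicaUpdateInProgress ("TRUE"/"FALSE").
    last_status: value of nsds5ReplicaLastInitStatus, or None if absent.
    """
    if in_progress.upper() == "TRUE":
        return "running"
    if last_status is None:
        # No status reported yet; the caller should keep polling.
        return "unknown"
    if "Total update succeeded" in last_status:
        return "succeeded"
    # Anything else (for example "Total update aborted") is a failure.
    return "failed"
```
</pre>
With a check like this, a client that never sees the success string
would report a failure even though the import on the replica finished.<br>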
<br>
[02/Apr/2014:20:50:22 +0000] conn=1365 op=20
UNBIND<br>
[02/Apr/2014:20:50:22 +0000] conn=1365 op=20
fd=114 closed - U1<br>
<br>
Then ipa-replica-install closes the
connection and reports the error.
<div>
<div><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div> </div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Thu, Apr
3, 2014 at 10:38 AM, Rich
Megginson <span dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:rmeggins@redhat.com"
target="_blank">rmeggins@redhat.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div text="#000000"
bgcolor="#FFFFFF">
<div>
<div>On 04/02/2014 09:22 PM,
Nevada Sanchez wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Okay.
Updated the gist with
the additional logs: <a
moz-do-not-send="true"
href="https://gist.github.com/nevsan/8b6f78d7396963dc5f70"
target="_blank">https://gist.github.com/nevsan/8b6f78d7396963dc5f70</a></div>
<div class="gmail_extra"><br>
<br>
</div>
</blockquote>
<br>
</div>
1) Dirsrv is crashing:<br>
[02/Apr/2014:20:49:53 +0000] -
389-Directory/1.3.1.22.a1
B2014.073.1751 starting up<br>
[02/Apr/2014:20:49:54 +0000] -
Db home directory is not set.
Possibly nsslapd-directory
(optionally
nsslapd-db-home-directory) is
missing in the config file.<br>
[02/Apr/2014:20:49:54 +0000] -
I'm resizing my cache
now...cache was 710029312 and
is now 8000000<br>
[02/Apr/2014:20:49:54 +0000] -
389-Directory/1.3.1.22.a1
B2014.073.1751 starting up<br>
[02/Apr/2014:20:49:54 +0000] -
Detected Disorderly Shutdown
last time Directory Server was
running, recovering database.<br>
[02/Apr/2014:20:49:55 +0000] -
slapd started. Listening on
All Interfaces port 389 for
LDAP requests<br>
<br>
Please use the instructions at
<a moz-do-not-send="true"
href="http://port389.org/wiki/FAQ#Debugging_Crashes"
target="_blank">http://port389.org/wiki/FAQ#Debugging_Crashes</a>
to get a core dump and stack
trace.<br>
<br>
2) The first occurrence of the
connection error is at
[02/Apr/2014:20:52:38 +0000],
but there is nothing in
the consumer error log after
[02/Apr/2014:20:50:21 +0000]
or in the consumer access log
after [02/Apr/2014:20:50:22
+0000].
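<br>
<br>
For item 1, a quick way to spot each crash-and-recover cycle in an
errors log is to scan for the recovery message. This is a minimal
sketch that assumes the log format shown above; the function name is
only illustrative.<br>
<pre>
```python
import re

# Timestamp in square brackets, e.g. [02/Apr/2014:20:49:54 +0000]
STAMP_RE = re.compile(r"^\[([^\]]+)\]")


def disorderly_shutdowns(log_text):
    """Return timestamps of lines reporting a disorderly shutdown."""
    stamps = []
    for line in log_text.splitlines():
        if "Detected Disorderly Shutdown" not in line:
            continue
        match = STAMP_RE.match(line.strip())
        if match:
            stamps.append(match.group(1))
    return stamps
```
</pre>
Each timestamp returned marks a startup that followed an unclean exit,
so the crash itself happened some time before it.<br>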
<div>
<div><br>
<br>
<blockquote type="cite">
<div class="gmail_extra">
<div
class="gmail_quote">
On Wed, Apr 2, 2014
at 9:38 PM, Rich
Megginson <span
dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:rmeggins@redhat.com" target="_blank">rmeggins@redhat.com</a>&gt;</span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0 0
0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div
text="#000000"
bgcolor="#FFFFFF">
<div>
<div>On
04/02/2014
03:01 PM,
Nevada Sanchez
wrote:<br>
</div>
<blockquote
type="cite">
<div dir="ltr">Okay,
I ran it with
debug on. The
output is
quite large.
I'm not sure
what the
etiquette is
for posting
large logs, so
I threw it on
gist here: <a
moz-do-not-send="true"
href="https://gist.githubusercontent.com/nevsan/8b6f78d7396963dc5f70/raw/b76b3c3acce4f12d292d680f4c1dab39c05888d5/gistfile1.txt"
target="_blank">https://gist.githubusercontent.com/nevsan/8b6f78d7396963dc5f70/raw/b76b3c3acce4f12d292d680f4c1dab39c05888d5/gistfile1.txt</a>
<div> <br>
</div>
<div>Let me
know if I
should copy it
into the
thread
instead.</div>
</div>
</blockquote>
<br>
</div>
Ok. Now can you
post excerpts
from the dirsrv
errors log from
both the master
replica and the
replica from
around the time
of the failure?
<div>
<div><br>
<br>
<blockquote
type="cite">
<div
class="gmail_extra"><br>
<br>
<div
class="gmail_quote">On
Wed, Apr 2,
2014 at 1:49
PM, Rich
Megginson <span
dir="ltr">&lt;<a
moz-do-not-send="true" href="mailto:rmeggins@redhat.com" target="_blank">rmeggins@redhat.com</a>&gt;</span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div
text="#000000"
bgcolor="#FFFFFF">
<div>
<div>On
04/02/2014
11:45 AM,
Nevada Sanchez
wrote:<br>
</div>
<blockquote
type="cite">
<div dir="ltr">My apologies. I mistakenly ran the failing
ldapsearch as an unprivileged user (who couldn't read the
slapd-EXAMPLE-COM directory). Running as root, it now
works just fine (same result as the one that worked). SSL
seems not to be the issue. Also, I haven't changed the SSL
certs since I first set up the master.<br>
<div><br>
</div>
<div>I have been doing the replica-side steps from scratch
(even going so far as starting with a new machine). For
the master side, I have just been re-preparing the
replica. I hope I don't have to start from scratch with
the master replica.</div>
</div>
</blockquote>
<br>
</div>
I guess the
next step
would be to do
the
ipa-replica-install
using -ddd and
review the
extra debug
information
that comes
out.
<div>
<div><br>
<br>
<blockquote
type="cite">
<div dir="ltr">
</div>
<div
class="gmail_extra"><br>
<br>
<div
class="gmail_quote">On
Wed, Apr 2,
2014 at 11:45
AM, Rob
Crittenden <span
dir="ltr">&lt;<a
moz-do-not-send="true" href="mailto:rcritten@redhat.com" target="_blank">rcritten@redhat.com</a>&gt;</span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">Rich
Megginson
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div> On
04/02/2014
09:20 AM,
Nevada Sanchez
wrote:<br>
</div>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div> Okay, we
might be on to
something:<br>
<br>
ipa -> ipa2<br>
================================<br>
$
LDAPTLS_CACERTDIR=/etc/dirsrv/slapd-EXAMPLE-COM
ldapsearch
-xLLLZZ<br>
</div>
-h <a
moz-do-not-send="true"
href="http://ipa2.example.com" target="_blank">ipa2.example.com</a>
-s base -b ""
<div><br>
'objectclass=*'
vendorVersion<br>
dn:<br>
vendorVersion:
389-Directory/1.3.1.22.a1
B2014.073.1751<br>
================================<br>
<br>
ipa2 -> ipa<br>
================================<br>
$
LDAPTLS_CACERTDIR=/etc/dirsrv/slapd-EXAMPLE-COM
ldapsearch
-xLLLZZ<br>
</div>
-h <a
moz-do-not-send="true"
href="http://ipa.example.com" target="_blank">ipa.example.com</a>
-s base -b ""
<div>
<div><br>
'objectclass=*'
vendorVersion<br>
ldap_start_tls:
Connect error
(-11)<br>
additional
info: TLS
error
-8172:Peer's
certificate
issuer has
been<br>
marked as not
trusted by the
user.<br>
================================<br>
<br>
The original
IPA trusts the
replica (since
it signed the
cert, I<br>
assume), but
the replica
doesn't trust
the main IPA
server. I
guess<br>
the ZZ option
would have
shown me the
failure that I
missed in my<br>
initial
ldapsearch
tests.<br>
</div>
</div>
</blockquote>
<div>
<div>
-Z[Z] Issue StartTLS (Transport Layer Security) extended
operation. If you use -ZZ, the command will require the
operation to be successful.<br>
<br>
i.e. use TLS (via StartTLS), and
require a successful
handshake<br>
<br>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<br>
Anyway, what's
the best way
to remedy this
in a way that
makes IPA<br>
happy? (I've
found that
LDAP can have
different
requirements
on which<br>
certs go
where).<br>
</blockquote>
<br>
I'm not sure.
ipa-server-install/ipa-replica-prepare/ipa-replica-install<br>
is supposed to
take care of
installing the
CA cert
properly for
you. If<br>
you try to
hack it and
install the CA
cert manually,
you will
probably<br>
miss something
else that ipa
install did
not do.<br>
<br>
I think the
only way to
ensure that
you have a
properly
configured ipa<br>
server +
replicas is to
get all of the
ipa commands
completing
successfully.<br>
<br>
Which means
going back to
the drawing
board and
starting over
from scratch.<br>
</div>
</div>
</blockquote>
<br>
You can
compare the
certs that
each side is
using with:<br>
<br>
# certutil -L
-d
/etc/dirsrv/slapd-EXAMPLE-COM<br>
<br>
Did you by
chance replace
the SSL server
certs that IPA
uses on your
working
master?<span><font
color="#888888"><br>
<br>
rob<br>
</font></span></blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</body>
</html>