<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<br>
<br>
<div class="moz-cite-prefix">On 09/05/2016 12:05 PM, Rakesh
Rajasekharan wrote:<br>
</div>
<blockquote
cite="mid:CANAMAkoWJjgk0mDJXFRE6xJyrCjenHjZTMemjcVnT1EgEk6ZTA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>
<div>
<div>
<div>
<div>Hi Thierry,<br>
<br>
<br>
</div>
<div>I was getting the hang issue while running
ipa-client-install simultaneously on a few clients.<br>
</div>
<div>However, today, I am not able to replicate that. <br>
</div>
<div><br>
</div>
<div>I could not get a gdb trace, but I will try getting one
the next time I face this issue.<br>
<br>
</div>
The CPU does not stay high; it just momentarily touches
a high value and then drops down to around 2-7%.<br>
<br>
</div>
One question I have is: is it OK to set
nsslapd-threadnumber to a very high value?<br>
</div>
I have around 4000 clients, with
nsslapd-maxthreadsperconn set to 5. So, can I set
nsslapd-threadnumber to around 25000?<br>
</div>
</div>
</div>
</blockquote>
<br>
Hello,<br>
<br>
I know of users running in production with several hundred
threads (>600) without problems.<br>
<br>
I do not recall having suggested increasing that number, or for
what reason.<br>
Usually 30 workers is a good enough value. It can become a bottleneck
if, for some reason, each operation takes very long to complete and
exhausts the pool of workers. You can monitor the work queue:<br>
<blockquote><tt>ldapsearch -D "cn=directory manager" -w xxx -LLL -b
"cn=monitor" -s base opsinitiated opscompleted</tt><br>
</blockquote>
<br>
If opsinitiated - opscompleted remains close to threadnumber, then yes,
it would be valuable to increase it.<br>
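As an illustration, the backlog can be computed from those two counters with a small sketch like the one below (the counter values are made up; on a live server they would come from the cn=monitor search above):<br>

```shell
#!/bin/sh
# Sketch: in-flight operations = opsinitiated - opscompleted.
# On a live server the two variables would be filled from cn=monitor, e.g.:
#   opsinitiated=$(ldapsearch -D "cn=directory manager" -w xxx -LLL \
#       -b "cn=monitor" -s base opsinitiated | awk '/^opsinitiated:/{print $2}')
opsinitiated=15230   # hypothetical sample values
opscompleted=15214
backlog=$((opsinitiated - opscompleted))
echo "backlog=$backlog"
# If the backlog stays close to nsslapd-threadnumber, the workers are saturated.
```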
<br>
The computation #clients * #async_ops_per_client sounds like overkill.
Even if all clients send all their requests at exactly the same time,
it is very likely that some shared resource (db page, log,
allocator...) will serialize them. If monitoring shows a need to
increase the workers, you would for example set it to 50, then
monitor, then set it to 100, then monitor... until you find a good
enough value.<br>
Note that increasing the number of threads increases the memory
footprint, which reduces the efficiency of the file system cache and
can increase response times.<br>
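For example, stepping the worker count up to 50 can be done with an LDIF like the one below, applied with ldapmodify as "cn=directory manager" (depending on the 389-ds version, a restart may be needed for the change to take effect):<br>

```ldif
dn: cn=config
changetype: modify
replace: nsslapd-threadnumber
nsslapd-threadnumber: 50
```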
<br>
<br>
best regards<br>
thierry <br>
<br>
<blockquote
cite="mid:CANAMAkoWJjgk0mDJXFRE6xJyrCjenHjZTMemjcVnT1EgEk6ZTA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div><br>
</div>
Thanks<br>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Mon, Sep 5, 2016 at 1:03 PM, thierry
bordaz <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:tbordaz@redhat.com" target="_blank">tbordaz@redhat.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> <br>
Hi Rakesh,<br>
<br>
Were you able to get a pstack or full stack with gdb (<a
moz-do-not-send="true"
href="http://www.port389.org/docs/389ds/FAQ/faq.html#debugging-crashes"
target="_blank"><a class="moz-txt-link-freetext" href="http://www.port389.org/docs/">http://www.port389.org/docs/</a><wbr>389ds/FAQ/faq.html#debugging-<wbr>crashes</a>)
when the server hangs ?<br>
<br>
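For reference, a minimal sketch of capturing such stacks (assuming gdb and the matching 389-ds-base debuginfo packages are installed; the output path is arbitrary):<br>

```shell
# Helper to dump a backtrace of every thread of a running ns-slapd.
# Needs gdb plus the matching debuginfo packages for readable symbols.
capture_stacks() {
    pid="$1"
    gdb -batch -ex 'thread apply all bt full' -p "$pid" \
        > "/tmp/ns-slapd.$pid.stacks.txt" 2>&1
}
# Usage against a live (hung) server:
#   capture_stacks "$(pidof ns-slapd)"
```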
If it happens with 500 threads as well as with 30, using
30 threads is a better choice to debug this issue.<br>
I will try to reproduce using 150 parallel 'ipa user-find
p-testipa' commands<br>
<br>
Something I am unsure of is whether the CPU consumption stays high
(you mentioned 340% CPU usage) as long as the hang lasts,
or whether after a sudden shot up to 340% (which marks the
beginning of the hang) it drops while the server stays hung?<br>
<br>
thanks<br>
thierry<br>
<br>
<div>On 09/04/2016 08:40 PM, Rakesh Rajasekharan wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>strace on the slapd process actually had this in
the output:<br>
FUTEX_WAIT_PRIVATE<br>
<br>
</div>
and checking the number of threads slapd had,
there were 5015 threads<br>
<div><br>
ps -efL|grep slapd|wc -l<br>
5015<br>
<br>
strace on most of the threads gave this output <br>
<br>
strace -p 67411<br>
Process 67411 attached<br>
futex(0x7f3f0226b41c, FUTEX_WAIT_PRIVATE, 1, NULL) =
-1 EAGAIN (Resource temporarily unavailable)<br>
futex(0x7f3f0226b41c, FUTEX_WAIT_PRIVATE, 2,
NULL^CProcess 67411 detached<br>
<br>
<br>
<br>
<br>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Sun, Sep 4, 2016 at 5:34
PM, Rakesh Rajasekharan <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:rakesh.rajasekharan@gmail.com"
target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:rakesh.rajasekharan@gmail.com">rakesh.rajasekharan@gmail.com</a></a><wbr>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">
<div>
<div>
<div>
<div>
<div>I have again got the issue of IPA
hanging. The issue came up when I
tried to run ipa-client-install on 142
clients simultaneously.<br>
<br>
<br>
</div>
<div>None of the IPA commands are
responding, and I see this error<br>
<br>
ipa user-find p-testipa<br>
ipa: ERROR: Insufficient access:
SASL(-1): generic failure: GSSAPI
Error: Unspecified GSS failure. Minor
code may provide more information (KDC
returned error string: PROCESS_TGS)<br>
<br>
KRB5_TRACE=/dev/stdout kinit admin<br>
[41178] 1472984115.233214: Getting
initial credentials for <a
moz-do-not-send="true"
href="mailto:admin@XYZ.COM"
target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:admin@XYZ.COM">admin@XYZ.COM</a></a><br>
[41178] 1472984115.235257: Sending
request (167 bytes) to <a
moz-do-not-send="true"
href="http://XYZ.COM"
target="_blank">XYZ.COM</a><br>
[41178] 1472984115.235419: Initiating
TCP connection to stream <a
moz-do-not-send="true"
href="http://10.1.3.36:88"
target="_blank">10.1.3.36:88</a><br>
[41178] 1472984115.235685: Sending TCP
request to stream <a
moz-do-not-send="true"
href="http://10.1.3.36:88"
target="_blank">10.1.3.36:88</a><br>
[41178] 1472984120.238914: Received
answer (174 bytes) from stream <a
moz-do-not-send="true"
href="http://10.1.3.36:88"
target="_blank">10.1.3.36:88</a><br>
[41178] 1472984120.238925: Terminating
TCP connection to stream <a
moz-do-not-send="true"
href="http://10.1.3.36:88"
target="_blank">10.1.3.36:88</a><br>
[41178] 1472984120.238993: Response
was from master KDC<br>
</div>
<div>[41<br>
<br>
<br>
</div>
<div>Running an ldapsearch to look at the
db does not give any results and
just hangs there:<br>
<br>
ldapsearch -x -D 'cn=Directory
Manager' -W -s one -b
'cn=kerberos,dc=xyz,dc=com'<br>
Enter LDAP Password:<br>
<br>
</div>
<div>even an ldapsearch -x does not
respond<br>
</div>
<div>At this point, I am sure that slapd
is the one causing issues<br>
</div>
<div><br>
</div>
<div>Running an strace against the hung
slapd itself seems to get stuck and does
not proceed after saying "attaching to
process"<br>
<br>
</div>
<div>From some other posts, I read
Thierry suggesting to increase the
nsslapd-threadnumber value<br>
<br>
</div>
<div>It was set to 30, I think that
might be too low.<br>
<br>
</div>
<div>I have raised it to 500<br>
</div>
<br>
</div>
<div>Now, after restarting the service,
ldapsearch starts responding.<br>
</div>
But running the test to add a sudden high
number of clients again left ns-slapd in a
hung state<br>
<br>
</div>
When I attempted adding the clients, the
ns-slapd CPU usage shot up to 340% and after
that ns-slapd stopped responding<br>
<br>
</div>
So now, at least I know what might be causing
the issue, and I can now easily reproduce it.<br>
<br>
</div>
<div>Is there a way I can make ns-slapd handle a
sudden bump in incoming requests from
ipa-client-install?<br>
<br>
</div>
<div>Thanks<span><font color="#888888"><br>
</font></span></div>
<span><font color="#888888">
<div>Rakesh<br>
</div>
<br>
<div>
<div>
<div>
<div>
<div>
<div><br>
</div>
<div><br>
<br>
<br>
</div>
</div>
</div>
</div>
</div>
</div>
</font></span></div>
<div class="gmail_extra"><br>
<div class="gmail_quote">
<div>
<div>On Mon, Aug 29, 2016 at 11:18 PM, Rich
Megginson <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:rmeggins@redhat.com"
target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:rmeggins@redhat.com">rmeggins@redhat.com</a></a>></span>
wrote:<br>
</div>
</div>
<blockquote class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px
#ccc solid;padding-left:1ex">
<div>
<div>
<div bgcolor="#FFFFFF" text="#000000"><span>
<div>On 08/29/2016 10:53 AM, Rakesh
Rajasekharan wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>Hi Thierry,<br>
<br>
My machine has 30GB RAM
..and 389-ds version is
1.3.4<br>
<br>
</div>
ldapsearch shows the values
for nsslapd-cachememsize
updated to 200MB.<br>
<br>
ldapsearch -LLL -o
ldif-wrap=no -D "cn=directory
manager" -w 'mypassword' -b
'cn=userRoot,cn=ldbm
database,cn=plugins,cn=config'<wbr>|grep
nsslapd-cachememsize<br>
nsslapd-cachememsize:
209715200<br>
<br>
<br>
So, it seems to have updated,
though seeing that warning
(WARNING: ipaca: entry
cache size 10485760B is less
than db size 11599872B) in the
log confuses me a bit.<br>
<br>
</div>
<div>There's one more entry that I
found from the ldapsearch to
be a bit low<br>
<br>
nsslapd-dncachememsize:
10485760<br>
maxdncachesize: 10485760<br>
<br>
</div>
<div>Should I update these as
well to a higher value?<br>
<br>
</div>
<div>At the time when the issue
happened, the memory usage as
well as the overall load of
the system was very low.<br>
I will try reproducing the
issue at least in my QA
env, probably by trying to
mock simultaneous parallel
logins to a large number of
hosts<br>
</div>
</div>
</blockquote>
<br>
</span> To monitor your cache sizes,
please use the dbmon.sh tool provided
with your distro. If that is not
available with your particular distro,
see <a moz-do-not-send="true"
href="https://github.com/richm/scripts/wiki/dbmon.sh"
target="_blank">https://github.com/richm/scrip<wbr>ts/wiki/dbmon.sh</a>
<div>
<div><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div><br>
<br>
</div>
<div>thanks<br>
</div>
<div>Rakesh<br>
</div>
<div><br>
</div>
<div><br>
</div>
<br>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On
Mon, Aug 29, 2016 at 8:16
PM, thierry bordaz <span
dir="ltr"><<a
moz-do-not-send="true"
href="mailto:tbordaz@redhat.com"
target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:tbordaz@redhat.com">tbordaz@redhat.com</a></a>></span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div text="#000000"
bgcolor="#FFFFFF"> Hi
Rakesh,<br>
<br>
Those tunings may depend
on the memory available
on your machine.<br>
nsslapd-cachememsize
allows the entry cache
to consume up to 200MB,
but its memory footprint
is known to go above that.<br>
200MB for both looks pretty
good to me. How large is
your machine? What is
your version of 389-ds?<br>
<br>
Those warnings do not
change your settings. They
just indicate that the entry
caches of 'ipaca' and
'retrocl' are small, but
that is fine. The size of
the entry cache matters
mostly for userRoot.<br>
You may double check the
actual values, after
restart, with ldapsearch
on 'cn=userRoot,cn=ldbm
database,cn=plugins,cn=config'
and 'cn=config,cn=ldbm
database,cn=plugins,cn=config'<wbr>.<br>
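(If you do want to silence the 'ipaca' warning, the same kind of LDIF shown earlier works against that backend's own entry; the DN below is inferred from the warning message, so verify it with ldapsearch first:)<br>

```ldif
dn: cn=ipaca,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-cachememsize
nsslapd-cachememsize: 209715200
```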
<br>
A next step is to measure
the response time of DS,
to know whether it is
responsible for the hang
or not.<br>
The logs, and possibly a
pstack taken during those
intermittent hangs, will
help to determine that.<br>
<br>
regards<span><font
color="#888888"><br>
thierry</font></span>
<div>
<div><br>
<br>
<br>
<br>
<br>
<div>On 08/29/2016
04:25 PM, Rakesh
Rajasekharan
wrote:<br>
</div>
<blockquote
type="cite">
<div dir="ltr">
<div>
<div>
<div>
<div>
<div>
<div>I tried
increasing the
nsslapd-dbcachesize
and
nsslapd-cachememsize
in my QA envs
to 200MB.<br>
<br>
</div>
However, in my
log files, I
still see this
message<br>
[29/Aug/2016:04:34:37
+0000] -
WARNING:
ipaca: entry
cache size
10485760B is
less than db
size
11599872B; We
recommend to
increase the
entry cache
size
nsslapd-cachememsize.<br>
[29/Aug/2016:04:34:37
+0000] -
WARNING:
changelog:
entry cache
size 2097152B
is less than
db size
441647104B; We
recommend to
increase the
entry cache
size
nsslapd-cachememsize.<br>
<br>
</div>
these are my
ldif files
that I used to
modify the
values<br>
modify entry
cache size<br>
cat
modify-cache-mem-size.ldif<br>
dn:
cn=userRoot,cn=ldbm
database,cn=plugins,cn=config<br>
changetype:
modify<br>
replace:
nsslapd-cachememsize<br>
nsslapd-cachememsize:
209715200<br>
<br>
modify db
cache size<br>
cat
modfy-db-cache-size.ldif<br>
dn:
cn=config,cn=ldbm
database,cn=plugins,cn=config<br>
changetype:
modify<br>
replace:
nsslapd-dbcachesize<br>
nsslapd-dbcachesize:
209715200<br>
<br>
</div>
After
modifying, I
restarted the IPA
services<br>
<br>
</div>
Is there
anything else
that I need
to take care
of, as the logs
suggest it's
still not
getting the
updated values?<br>
<br>
</div>
Thanks<br>
</div>
Rakesh<br>
</div>
<div
class="gmail_extra"><br>
<div
class="gmail_quote">On
Mon, Aug 29,
2016 at 6:07
PM, Rakesh
Rajasekharan <span
dir="ltr"><<a
moz-do-not-send="true" href="mailto:rakesh.rajasekharan@gmail.com"
target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:rakesh.rajasekharan@gmail.com">rakesh.rajasekharan@gmail.com</a></a><wbr>></span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div dir="ltr">
<div>
<div>
<div>
<div>
<div>
<div>Hi
Thierry,<br>
<br>
</div>
Because of the
issues, we had
to revert back
to running
openldap in
production.<br>
<br>
</div>
I have now
done a few TCP
related
changes in
sysctl.conf
and have also
increased the
nsslapd-dbcachesize
and
nsslapd-cachememsize
to 200MB<br>
<br>
</div>
I will again
start
migrating
hosts back to
IPA and see if
I face the
earlier issue.<br>
<br>
</div>
I will update
back once I
have something<br>
<br>
<br>
</div>
Thanks,<br>
</div>
Rakesh<br>
<div>
<div>
<div>
<div><br>
<br>
</div>
</div>
</div>
</div>
</div>
<div>
<div>
<div
class="gmail_extra"><br>
<div
class="gmail_quote">On
Thu, Aug 25,
2016 at 2:17
PM, thierry
bordaz <span
dir="ltr"><<a
moz-do-not-send="true" href="mailto:tbordaz@redhat.com" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:tbordaz@redhat.com">tbordaz@redhat.com</a></a>></span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div
text="#000000"
bgcolor="#FFFFFF">
<div>
<div> <br>
<br>
<div>On
08/25/2016
10:15 AM,
Rakesh
Rajasekharan
wrote:<br>
</div>
<blockquote
type="cite">
<div dir="ltr">
<div>All of
the
troubleshooting
seems fine.<br>
<br>
<br>
</div>
<div>However,
running logconv.pl gives me this
output:<br>
<br>
-----
Recommendations
-----<br>
<br>
1. You have
unindexed
components,
this can be
caused from a
search on an
unindexed
attribute, or
your returned
results
exceeded the
allidsthreshold.
Unindexed
components are
not
recommended.
To refuse
unindexed
searches,
switch
'nsslapd-require-index'
to 'on' under
your database
entry (e.g.
cn=UserRoot,cn=ldbm
database,cn=plugins,cn=config)<wbr>.<br>
<br>
2. You have
a significant
difference
between binds
and unbinds.
You may want
to investigate
this
difference.<br>
<br>
</div>
<div><br>
</div>
<div>I feel,
this could be
a pointer to
things going
slow.. and IPA
hanging. I
think i now
have something
that I can try
and nail down
this issue.<br>
<br>
On a sidenote,
I was earlier
running
openldap and
migrated over
to Freeipa, <br>
<br>
</div>
<div>Thanks<br>
</div>
<div>Rakesh<br>
</div>
<div><br>
<br>
</div>
</div>
<div
class="gmail_extra"><br>
<div
class="gmail_quote">On
Wed, Aug 24,
2016 at 12:38
PM, Petr
Spacek <span
dir="ltr"><<a
moz-do-not-send="true" href="mailto:pspacek@redhat.com" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:pspacek@redhat.com">pspacek@redhat.com</a></a>></span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex"><span>On
23.8.2016
18:44, Rakesh
Rajasekharan
wrote:<br>
> I think
thers
something
seriously
wrong with my
system<br>
><br>
> not able
to run any
IPA commands<br>
><br>
> klist<br>
> Ticket
cache:
KEYRING:persistent:0:0<br>
> Default
principal: <a
moz-do-not-send="true" href="mailto:admin@XYZ.COM" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:admin@XYZ.COM">admin@XYZ.COM</a></a><br>
><br>
> Valid
starting
Expires
Service
principal<br>
>
2016-08-23T16:26:36
2016-08-24T16:26:22
krbtgt/<a
moz-do-not-send="true"
href="mailto:XYZ.COM@XYZ.COM" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:XYZ.COM@XYZ.COM">XYZ.COM@XYZ.COM</a></a><br>
><br>
><br>
>
[root@prod-ipa-master-1a
:~] ipactl
status<br>
> Directory
Service:
RUNNING<br>
> krb5kdc
Service:
RUNNING<br>
> kadmin
Service:
RUNNING<br>
>
ipa_memcached
Service:
RUNNING<br>
> httpd
Service:
RUNNING<br>
>
pki-tomcatd
Service:
RUNNING<br>
> ipa-otpd
Service:
RUNNING<br>
> ipa:
INFO: The
ipactl command
was successful<br>
><br>
><br>
><br>
>
[root@prod-ipa-master
:~] ipa
user-find
p-testuser<br>
> ipa:
ERROR:
Kerberos
error:
('Unspecified
GSS failure.
Minor code may<br>
> provide
more
information',
851968)/("Cannot
contact any
KDC for realm
'<br>
> <a
moz-do-not-send="true"
href="http://XYZ.COM" rel="noreferrer" target="_blank">XYZ.COM</a>'",
-1765328228)<br>
</span></blockquote>
</div>
</div>
</blockquote>
<br>
</div>
</div>
Hi Rakesh,<br>
<br>
<blockquote>Since you
have a reproducible
test case,
would you
rerun the
command above?<br>
During its
processing you
may monitor the DS
process load
(top). If it
is high, you
may get some
pstacks of it.<br>
Also, would you
attach the
part of the DS
access logs
taken during
the command.<br>
<br>
regards<br>
thierry<br>
</blockquote>
<div>
<div>
<blockquote
type="cite">
<div
class="gmail_extra">
<div
class="gmail_quote">
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex"><span>
><br>
<br>
</span>This is
weird because
the server
seems to be
up.<br>
<br>
Please follow<br>
<a
moz-do-not-send="true"
href="http://www.freeipa.org/page/Tr" target="_blank"><a class="moz-txt-link-freetext" href="http://www.freeipa.org/page/Tr">http://www.freeipa.org/page/Tr</a></a><wbr>oubleshooting#Authentication.2<wbr>FKerberos<br>
<br>
Petr^2 Spacek<br>
<div>
<div><br>
><br>
><br>
> Thanks<br>
><br>
> Rakesh<br>
><br>
> On Tue,
Aug 23, 2016
at 10:01 PM,
Rakesh
Rajasekharan
<<br>
> <a
moz-do-not-send="true"
href="mailto:rakesh.rajasekharan@gmail.com" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:rakesh.rajasekharan@gmail.com">rakesh.rajasekharan@gmail.com</a></a>>
wrote:<br>
><br>
>> I
changed the
logging level
to 4,
modifying
nsslapd-accesslog-level<br>
>><br>
>> But,
the hang is
still there,
though I don't
see the
segfault now<br>
>><br>
>><br>
>><br>
>><br>
>> On
Tue, Aug 23,
2016 at 9:02
PM, Rakesh
Rajasekharan
<<br>
>> <a
moz-do-not-send="true"
href="mailto:rakesh.rajasekharan@gmail.com" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:rakesh.rajasekharan@gmail.com">rakesh.rajasekharan@gmail.com</a></a>>
wrote:<br>
>><br>
>>>
My disk was
getting filled
too fast<br>
>>><br>
>>>
logs under
/var/log/dirsrv
were coming to
around 5 GB,
quickly
filling up<br>
>>><br>
>>>
Is there a way
to make the
logging less
verbose<br>
>>><br>
>>><br>
>>><br>
>>>
On Tue, Aug
23, 2016 at
6:41 PM, Petr
Spacek <<a
moz-do-not-send="true" href="mailto:pspacek@redhat.com" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:pspacek@redhat.com">pspacek@redhat.com</a></a>>
wrote:<br>
>>><br>
>>>>
On 23.8.2016
15:07, Rakesh
Rajasekharan
wrote:<br>
>>>>>
I was able to
fix that may
be
temporarily...
when i checked
the<br>
>>>>
network..<br>
>>>>>
there was
another
process that
was running
and consuming
a lot of<br>
>>>>
network (<br>
>>>>>
i have no idea
who did that.
I need to
seriously
start
restricting<br>
>>>>
people<br>
>>>>>
access to this
machine )<br>
>>>>><br>
>>>>>
after killing
that
perfomance
improved
drastically<br>
>>>>><br>
>>>>>
But now,
suddenly I
started
experiencing
the same hang.<br>
>>>>><br>
>>>>>
This time , I
gert the
following
error when
checked dmesg<br>
>>>>><br>
>>>>>
[ 301.236976]
ns-slapd[3124]:
segfault at 0
ip
00007f1de416951c
sp<br>
>>>>>
00007f1dee1dba70
error 4 in
libcos-plugin.so[7f1de4166000+<wbr>b000]<br>
>>>>>
[ 1116.248431]
TCP:
request_sock_TCP:
Possible SYN
flooding on
port 88.<br>
>>>>>
Sending
cookies.
Check SNMP
counters.<br>
>>>>>
[11831.397037]
ns-slapd[22550]:
segfault at 0
ip
00007f533d82251c
sp<br>
>>>>>
00007f5347894a70
error 4 in
libcos-plugin.so[7f533d81f000+<wbr>b000]<br>
>>>>>
[11832.727989]
ns-slapd[22606]:
segfault at 0
ip
00007f6231eb951c
sp<br>
>>>>>
00007f623bf2ba70
error 4 in
libcos-plugin.so[7f6231eb6000+<wbr>b00<br>
>>>><br>
>>>>
Okay, this one
is serious.
The LDAP
server
crashed.<br>
>>>><br>
>>>>
1. Make sure
all your
packages are
up-to-date.<br>
>>>><br>
>>>>
Please see<br>
>>>>
<a
moz-do-not-send="true"
href="http://directory.fedoraproject" target="_blank"><a class="moz-txt-link-freetext" href="http://directory.fedoraproject">http://directory.fedoraproject</a></a><wbr>.org/docs/389ds/FAQ/faq.html#d<br>
>>>>
ebugging-crashes<br>
>>>>
for further
instructions
how to debug
this.<br>
>>>><br>
>>>>
Petr^2 Spacek<br>
>>>><br>
>>>>><br>
>>>>>
and in
/var/log/dirsrv/example-com/er<wbr>rors<br>
>>>>><br>
>>>>>
[23/Aug/2016:12:49:36
+0000]
DSRetroclPlugin
-
delete_changerecord:<br>
>>>>
could<br>
>>>>>
not delete
change record
3291138 (rc:
32)<br>
>>>>>
[23/Aug/2016:12:49:36
+0000]
DSRetroclPlugin
-
delete_changerecord:<br>
>>>>
could<br>
>>>>>
not delete
change record
3291139 (rc:
32)<br>
>>>>>
[23/Aug/2016:12:49:36
+0000]
DSRetroclPlugin
-
delete_changerecord:<br>
>>>>
could<br>
>>>>>
not delete
change record
3291140 (rc:
32)<br>
>>>>>
[23/Aug/2016:12:49:36
+0000]
DSRetroclPlugin
-
delete_changerecord:<br>
>>>>
could<br>
>>>>>
not delete
change record
3291141 (rc:
32)<br>
>>>>>
[23/Aug/2016:12:49:36
+0000]
DSRetroclPlugin
-
delete_changerecord:<br>
>>>>
could<br>
>>>>>
not delete
change record
3291142 (rc:
32)<br>
>>>>>
[23/Aug/2016:12:49:36
+0000]
DSRetroclPlugin
-
delete_changerecord:<br>
>>>>
could<br>
>>>>>
not delete
change record
3291143 (rc:
32)<br>
>>>>>
[23/Aug/2016:12:49:36
+0000]
DSRetroclPlugin
-
delete_changerecord:<br>
>>>>
could<br>
>>>>>
not delete
change record
3291144 (rc:
32)<br>
>>>>>
[23/Aug/2016:12:49:36
+0000]
DSRetroclPlugin
-
delete_changerecord:<br>
>>>>
could<br>
>>>>>
not delete
change record
3291145 (rc:
32)<br>
>>>>>
[23/Aug/2016:12:49:50
+0000] - Retry
count exceeded
in delete<br>
>>>>>
[23/Aug/2016:12:49:50
+0000]
DSRetroclPlugin
-
delete_changerecord:<br>
>>>>
could<br>
>>>>>
not delete
change record
3292734 (rc:
51)<br>
>>>>><br>
>>>>><br>
>>>>>
Can i do
something
about this
error.. I
treid to
restart ipa a
couple<br>
>>>>
of<br>
>>>>>
time but that
did not help<br>
>>>>><br>
>>>>>
Thanks<br>
>>>>>
Rakesh<br>
>>>>><br>
>>>>>
On Mon, Aug
22, 2016 at
2:27 PM, Petr
Spacek <<a
moz-do-not-send="true" href="mailto:pspacek@redhat.com" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:pspacek@redhat.com">pspacek@redhat.com</a></a>><br>
>>>>
wrote:<br>
>>>>><br>
>>>>>>
On 19.8.2016
19:32, Rakesh
Rajasekharan
wrote:<br>
>>>>>>>
I am running
my set up on
AWS cloud, and
entropy is low
at around<br>
>>>>
180 .<br>
>>>>>>><br>
>>>>>>>
I plan to
increase it bu
installing
haveged . But,
would low
entropy<br>
>>>>
by<br>
>>>>>>
any<br>
>>>>>>>
chance cause
this issue of
intermittent
hang .<br>
>>>>>>>
Also, the hang
is mostly
observed when
registering
around 20
clients<br>
>>>>>>>
together<br>
>>>>>><br>
>>>>>>
Possibly, I'm
not sure. If
you want to
dig into this,
I would do
this:<br>
>>>>>>
1. look what
process hangs
on client
(using pstree
command or so)<br>
>>>>>>
$ pstree<br>
>>>>>><br>
>>>>>>
2. look to
what server
and port is
the hanging
client
connected to<br>
>>>>>>
$ lsof -p
<PID of the
hanging
process><br>
>>>>>><br>
>>>>>>
3. jump to
server and see
what process
is bound to
the target
port<br>
>>>>>>
$ netstat -pn<br>
>>>>>><br>
>>>>>>
4. see where
the process if
hanging<br>
>>>>>>
$ strace -p
<PID of the
hanging
process><br>
>>>>>><br>
>>>>>>
I hope it
helps.<br>
>>>>>><br>
>>>>>>
Petr^2 Spacek<br>
>>>>>><br>
>>>>>>>
On Fri, Aug
19, 2016 at
7:24 PM,
Rakesh
Rajasekharan
<<br>
>>>>>>>
<a
moz-do-not-send="true"
href="mailto:rakesh.rajasekharan@gmail.com" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:rakesh.rajasekharan@gmail.com">rakesh.rajasekharan@gmail.com</a></a>>
wrote:<br>
>>>>>>><br>
>>>>>>>>
yes there
seems to be
something
thats
worrying.. I
have faced
this<br>
>>>>
today<br>
>>>>>>>>
as well.<br>
>>>>>>>>
There are few
hosts around
280 odd left
and when i try
adding them<br>
>>>>
to<br>
>>>>>>
IPA<br>
>>>>>>>>
, the slowness
begins..<br>
>>>>>>>><br>
>>>>>>>>
all the ipa
commands like
ipa
user-find..
etc becomes
very slow in<br>
>>>>>>>>
responding.<br>
>>>>>>>><br>
>>>>>>>>
the SYNC_RECV
are not many
though just
around 80-90
and today that<br>
>>>>
was<br>
>>>>>>>>
around 20 only<br>
>>>>>>>><br>
>>>>>>>><br>
>>>>>>>>
I have for now
increased
tcp_max_syn_backlog
to 5000.<br>
>>>>>>>>
For now the
slowness seems
to have gone..
but I will do
a try<br>
>>>>
adding the<br>
>>>>>>>>
clients again
tomorrow and
see how it
goes<br>
>>>>>>>><br>
>>>>>>>>
Thanks<br>
>>>>>>>>
Rakesh<br>
>>>>>>>><br>
>>>>>>>>
The issues<br>
>>>>>>>><br>
>>>>>>>>
On Fri, Aug
19, 2016 at
12:58 PM, Petr
Spacek <<a
moz-do-not-send="true" href="mailto:pspacek@redhat.com" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:pspacek@redhat.com">pspacek@redhat.com</a></a>><br>
>>>>>>
wrote:<br>
>>>>>>>><br>
>>>>>>>>>
On 18.8.2016
17:23, Rakesh
Rajasekharan
wrote:<br>
>>>>>>>>>>
Hi<br>
>>>>>>>>>><br>
>>>>>>>>>>
I am migrating
to freeipa
from openldap
and have
around 4000<br>
>>>>
clients<br>
>>>>>>>>>><br>
>>>>>>>>>>
I had openned
a another
thread on
that, but
chose to start
a new<br>
>>>>
one<br>
>>>>>>>>>
here<br>
>>>>>>>>>>
as its a
separate issue<br>
>>>>>>>>>><br>
>>>>>>>>>>
I was able to
change the
nssslapd-maxdescriptors
adding an ldif<br>
>>>>
file<br>
>>>>>>>>>><br>
>>>>>>>>>>
cat
nsslapd-modify.ldif<br>
>>>>>>>>>>
dn: cn=config<br>
>>>>>>>>>>
changetype:
modify<br>
>>>>>>>>>>
replace:
nsslapd-maxdescriptors<br>
>>>>>>>>>>
nsslapd-maxdescriptors:
17000<br>
>>>>>>>>>><br>
>>>>>>>>>>
and running
the ldapmodify
command<br>
>>>>>>>>>><br>
>>>>>>>>>>
I have now
started moving
clients
running an
openldap to
Freeipa<br>
>>>>
and<br>
>>>>>>>>>
have<br>
>>>>>>>>>>
today moved
close to 2000
clients<br>
>>>>>>>>>><br>
>>>>>>>>>>
However, I
have noticed
that IPA hangs
intermittently.<br>
>>>>>>>>>><br>
>>>>>>>>>>
running a
kinit admin
returns the
below error<br>
>>>>>>>>>>
kinit: Generic
error (see
e-text) while
getting
initial<br>
>>>>
credentials<br>
>>>>>>>>>><br>
>>>>>>>>>>
from the
/var/log/messages,
I see this
entry<br>
>>>>>>>>>><br>
>>>>>>>>>>
prod-ipa-master-int
kernel:
[104090.315801]
TCP:<br>
>>>>
request_sock_TCP:<br>
>>>>>>>>>>
Possible SYN
flooding on
port 88.
Sending
cookies.
Check SNMP<br>
>>>>>>
counters.<br>
>>>>>>>>><br>
>>>>>>>>>
I would be
worried about
this message.
Maybe
kernel/firewall
is<br>
>>>>
doing<br>
>>>>>>>>>
something
fishy behind
your back and
blocking some
connections or<br>
>>>>
so.<br>
>>>>>>>>><br>
>>>>>>>>>
Petr^2 Spacek<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>>>
Aug 18
13:00:01
prod-ipa-master-int
systemd[1]:
Started
Session<br>
>>>>
4885<br>
>>>>>>
of<br>
>>>>>>>>>>
user root.<br>
>>>>>>>>>>
Aug 18
13:00:01
prod-ipa-master-int
systemd[1]:
Starting
Session<br>
>>>>
4885<br>
>>>>>>
of<br>
>>>>>>>>>>
user root.<br>
>>>>>>>>>>
Aug 18
13:01:01
prod-ipa-master-int
systemd[1]:
Started
Session<br>
>>>>
4886<br>
>>>>>>
of<br>
>>>>>>>>>>
user root.<br>
>>>>>>>>>>
Aug 18
13:01:01
prod-ipa-master-int
systemd[1]:
Starting
Session<br>
>>>>
4886<br>
>>>>>>
of<br>
>>>>>>>>>>
user root.<br>
>>>>>>>>>>
Aug 18
13:02:40
prod-ipa-master-int
python[28984]:
ansible-command<br>
>>>>>>>>>
Invoked<br>
>>>>>>>>>>
with
creates=None
executable=None
shell=True
args=
removes=None<br>
>>>>>>>>>
warn=True<br>
>>>>>>>>>>
chdir=None<br>
>>>>>>>>>>
Aug 18
13:04:37
prod-ipa-master-int
sssd_be:
GSSAPI Error:<br>
>>>>
Unspecified<br>
>>>>>>>>>
GSS<br>
>>>>>>>>>>
failure.
Minor code may
provide more
information
(KDC returned<br>
>>>>
error<br>
>>>>>>>>>>
string:
PROCESS_TGS)<br>
>>>>>>>>>><br>
>>>>>>>>>>
Could it be
possible that
its due to the
initial load
of adding<br>
>>>>
the<br>
>>>>>>>>>
clients<br>
>>>>>>>>>>
or is there
something else
that I need to
take care of.<br>
</div>
</div>
</blockquote>
</div>
<br>
</div>
<br>
<fieldset></fieldset>
<br>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
<br>
<fieldset></fieldset>
<br>
</blockquote>
<p><br>
<span class="HOEnZb"><font
color="#888888"> </font></span></p>
<span class="HOEnZb"><font
color="#888888"> </font></span></div>
<span class="HOEnZb"><font
color="#888888"> </font></span></div>
<span class="HOEnZb"><font
color="#888888"> </font></span></div>
<span class="HOEnZb"><font
color="#888888"> <br>
</font></span></div>
<span class="HOEnZb"><font color="#888888">
</font></span></div>
<span class="HOEnZb"><font color="#888888">
<span>--<br>
Manage your subscription for the
Freeipa-users mailing list:<br>
<a moz-do-not-send="true"
href="https://www.redhat.com/mailman/listinfo/freeipa-users"
rel="noreferrer" target="_blank">https://www.redhat.com/mailman<wbr>/listinfo/freeipa-users</a><br>
Go to <a moz-do-not-send="true"
href="http://freeipa.org"
rel="noreferrer" target="_blank">http://freeipa.org</a>
for more info on the project<br>
</span></font></span></blockquote>
<span class="HOEnZb"><font color="#888888"> </font></span></div>
<span class="HOEnZb"><font color="#888888"> <br>
</font></span></div>
<span class="HOEnZb"><font color="#888888"> </font></span></blockquote>
<span class="HOEnZb"><font color="#888888"> </font></span></div>
<span class="HOEnZb"><font color="#888888"> <br>
</font></span></div>
<span class="HOEnZb"><font color="#888888"> <br>
<fieldset></fieldset>
<br>
</font></span></blockquote>
<br>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</body>
</html>