[Freeipa-users] Freeipa 4.2.0 hangs intermittently

Rakesh Rajasekharan rakesh.rajasekharan at gmail.com
Tue Aug 23 16:44:01 UTC 2016


I think there's something seriously wrong with my system.

I am not able to run any IPA commands.

klist
Ticket cache: KEYRING:persistent:0:0
Default principal: admin at XYZ.COM

Valid starting       Expires              Service principal
2016-08-23T16:26:36  2016-08-24T16:26:22  krbtgt/XYZ.COM at XYZ.COM


[root at prod-ipa-master-1a :~] ipactl status
Directory Service: RUNNING
krb5kdc Service: RUNNING
kadmin Service: RUNNING
ipa_memcached Service: RUNNING
httpd Service: RUNNING
pki-tomcatd Service: RUNNING
ipa-otpd Service: RUNNING
ipa: INFO: The ipactl command was successful



[root at prod-ipa-master :~] ipa user-find p-testuser
ipa: ERROR: Kerberos error: ('Unspecified GSS failure.  Minor code may provide more information', 851968)/("Cannot contact any KDC for realm 'XYZ.COM'", -1765328228)



Thanks

Rakesh

On Tue, Aug 23, 2016 at 10:01 PM, Rakesh Rajasekharan <
rakesh.rajasekharan at gmail.com> wrote:

> I changed the logging level to 4 by modifying nsslapd-accesslog-level.
>
> But the hang is still there, though I don't see the segfault now.
>
>
>
>
> On Tue, Aug 23, 2016 at 9:02 PM, Rakesh Rajasekharan <
> rakesh.rajasekharan at gmail.com> wrote:
>
>> My disk was filling up too fast.
>>
>> The logs under /var/log/dirsrv had grown to around 5 GB and were quickly
>> filling the disk.
>>
>> Is there a way to make the logging less verbose?
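For reference, a minimal sketch of how the access log level can be changed with the same LDIF approach used for nsslapd-maxdescriptors earlier in this thread. The function name accesslog_level_ldif is made up for illustration, and the bind options in the comment are an assumption that depends on the installation. 256 is the documented default level for 389 Directory Server. This only affects the access log; the thread does not say which log under /var/log/dirsrv was actually growing.

```shell
# accesslog_level_ldif LEVEL: print an LDIF modify request that sets
# nsslapd-accesslog-level on cn=config to LEVEL.
accesslog_level_ldif() {
    cat <<EOF
dn: cn=config
changetype: modify
replace: nsslapd-accesslog-level
nsslapd-accesslog-level: $1
EOF
}

# Applying it (not run here; bind DN and options are assumptions):
#   accesslog_level_ldif 256 | ldapmodify -x -D "cn=Directory Manager" -W
```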
>>
>>
>>
>> On Tue, Aug 23, 2016 at 6:41 PM, Petr Spacek <pspacek at redhat.com> wrote:
>>
>>> On 23.8.2016 15:07, Rakesh Rajasekharan wrote:
>>> > I was able to fix that, maybe temporarily. When I checked the network,
>>> > there was another process running and consuming a lot of network
>>> > bandwidth (I have no idea who started it; I need to seriously start
>>> > restricting people's access to this machine).
>>> >
>>> > After killing that, performance improved drastically.
>>> >
>>> > But now, suddenly, I started experiencing the same hang.
>>> >
>>> > This time, I get the following error when I check dmesg:
>>> >
>>> > [  301.236976] ns-slapd[3124]: segfault at 0 ip 00007f1de416951c sp
>>> > 00007f1dee1dba70 error 4 in libcos-plugin.so[7f1de4166000+b000]
>>> > [ 1116.248431] TCP: request_sock_TCP: Possible SYN flooding on port 88.
>>> > Sending cookies.  Check SNMP counters.
>>> > [11831.397037] ns-slapd[22550]: segfault at 0 ip 00007f533d82251c sp
>>> > 00007f5347894a70 error 4 in libcos-plugin.so[7f533d81f000+b000]
>>> > [11832.727989] ns-slapd[22606]: segfault at 0 ip 00007f6231eb951c sp
>>> > 00007f623bf2ba70 error 4 in libcos-plugin.so[7f6231eb6000+b00
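A small observation on the three segfault lines above: the instruction pointer always ends in 51c, which suggests the same crash site each time. A minimal sketch to check that, assuming a POSIX-style shell; the helper name segv_offset is made up for illustration. It subtracts the library load address (the number before the + inside the brackets) from the ip value.

```shell
# segv_offset IP BASE: print (in hex) how far the faulting instruction
# pointer lies past the load address of the shared library.
# Both arguments are hex strings without a 0x prefix, as printed by dmesg.
segv_offset() {
    printf '%x\n' "$(( 0x$1 - 0x$2 ))"
}

# Values copied from the three ns-slapd segfault lines above.
segv_offset 00007f1de416951c 7f1de4166000
segv_offset 00007f533d82251c 7f533d81f000
segv_offset 00007f6231eb951c 7f6231eb6000
```

All three calls print 351c, so ns-slapd appears to be crashing at the same place inside libcos-plugin.so every time. This does not replace a proper stack trace, it only shows the crashes are not random.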
>>>
>>> Okay, this one is serious. The LDAP server crashed.
>>>
>>> 1. Make sure all your packages are up-to-date.
>>>
>>> Please see
>>> http://directory.fedoraproject.org/docs/389ds/FAQ/faq.html#debugging-crashes
>>> for further instructions on how to debug this.
>>>
>>> Petr^2 Spacek
>>>
>>> >
>>> > and in /var/log/dirsrv/example-com/errors
>>> >
>>> > [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291138 (rc: 32)
>>> > [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291139 (rc: 32)
>>> > [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291140 (rc: 32)
>>> > [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291141 (rc: 32)
>>> > [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291142 (rc: 32)
>>> > [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291143 (rc: 32)
>>> > [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291144 (rc: 32)
>>> > [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291145 (rc: 32)
>>> > [23/Aug/2016:12:49:50 +0000] - Retry count exceeded in delete
>>> > [23/Aug/2016:12:49:50 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3292734 (rc: 51)
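The rc values in these lines are standard LDAP result codes (RFC 4511): 32 is noSuchObject and 51 is busy. A minimal lookup sketch covering only the codes seen here; the function name ldap_rc_name is made up for illustration.

```shell
# ldap_rc_name CODE: print the standard LDAP result code name (RFC 4511)
# for the few codes relevant to this log; anything else is reported as-is.
ldap_rc_name() {
    case "$1" in
        0)  echo "success" ;;
        32) echo "noSuchObject" ;;
        51) echo "busy" ;;
        *)  echo "unknown($1)" ;;
    esac
}

ldap_rc_name 32
ldap_rc_name 51
```

So the plugin was asked to delete change records that could not be found (32), and the last attempt was rejected because the server was busy (51), which seems consistent with the "Retry count exceeded in delete" line just before it.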
>>> >
>>> >
>>> > Can I do something about this error? I tried restarting IPA a couple of
>>> > times, but that did not help.
>>> >
>>> > Thanks
>>> > Rakesh
>>> >
>>> > On Mon, Aug 22, 2016 at 2:27 PM, Petr Spacek <pspacek at redhat.com>
>>> wrote:
>>> >
>>> >> On 19.8.2016 19:32, Rakesh Rajasekharan wrote:
>>> >>> I am running my setup on AWS, and entropy is low, at around 180.
>>> >>>
>>> >>> I plan to increase it by installing haveged. But could low entropy, by
>>> >>> any chance, cause this intermittent hang?
>>> >>> Also, the hang is mostly observed when registering around 20 clients
>>> >>> together.
>>> >>
>>> >> Possibly, I'm not sure. If you want to dig into this, I would do this:
>>> >> 1. look what process hangs on client (using pstree command or so)
>>> >> $ pstree
>>> >>
>>> >> 2. look to what server and port is the hanging client connected to
>>> >> $ lsof -p <PID of the hanging process>
>>> >>
>>> >> 3. jump to server and see what process is bound to the target port
>>> >> $ netstat -pn
>>> >>
>>> >> 4. see where the process is hanging
>>> >> $ strace -p <PID of the hanging process>
>>> >>
>>> >> I hope it helps.
>>> >>
>>> >> Petr^2 Spacek
>>> >>
>>> >>> On Fri, Aug 19, 2016 at 7:24 PM, Rakesh Rajasekharan <
>>> >>> rakesh.rajasekharan at gmail.com> wrote:
>>> >>>
>>> >>>> Yes, there seems to be something worrying. I have faced this today
>>> >>>> as well.
>>> >>>> There are around 280 hosts left, and when I try adding them to IPA,
>>> >>>> the slowness begins.
>>> >>>>
>>> >>>> All the ipa commands, like ipa user-find, become very slow to
>>> >>>> respond.
>>> >>>>
>>> >>>> There are not many SYN_RECV connections, though: just around 80-90,
>>> >>>> and today only around 20.
>>> >>>>
>>> >>>>
>>> >>>> I have for now increased tcp_max_syn_backlog to 5000.
>>> >>>> For now the slowness seems to have gone, but I will try adding the
>>> >>>> clients again tomorrow and see how it goes.
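A minimal sketch of how the half-open connection count mentioned above can be taken, assuming the usual `netstat -tn` column layout (state in the sixth column, local address in the fourth). The function name count_syn_recv is made up for illustration, and port 88 is used as the default because that is the port named in the kernel SYN flooding message.

```shell
# count_syn_recv [PORT]: read `netstat -tn` output on stdin and count
# sockets in SYN_RECV state whose local address ends in :PORT (default 88).
count_syn_recv() {
    awk -v port="${1:-88}" '
        $6 == "SYN_RECV" && $4 ~ (":" port "$") { n++ }
        END { print n + 0 }
    '
}

# Typical use on the server (not run here):
#   netstat -tn | count_syn_recv 88
#   sysctl net.ipv4.tcp_max_syn_backlog
```

Comparing that count against the current tcp_max_syn_backlog value while clients are being enrolled would show whether the backlog is actually the limit being hit.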
>>> >>>>
>>> >>>> Thanks
>>> >>>> Rakesh
>>> >>>>
>>> >>>> The issues
>>> >>>>
>>> >>>> On Fri, Aug 19, 2016 at 12:58 PM, Petr Spacek <pspacek at redhat.com>
>>> >> wrote:
>>> >>>>
>>> >>>>> On 18.8.2016 17:23, Rakesh Rajasekharan wrote:
>>> >>>>>> Hi
>>> >>>>>>
>>> >>>>>> I am migrating to FreeIPA from OpenLDAP and have around 4000 clients.
>>> >>>>>>
>>> >>>>>> I had opened another thread on that, but chose to start a new one
>>> >>>>>> here, as it's a separate issue.
>>> >>>>>>
>>> >>>>>> I was able to change nsslapd-maxdescriptors by adding an LDIF file:
>>> >>>>>>
>>> >>>>>> cat nsslapd-modify.ldif
>>> >>>>>> dn: cn=config
>>> >>>>>> changetype: modify
>>> >>>>>> replace: nsslapd-maxdescriptors
>>> >>>>>> nsslapd-maxdescriptors: 17000
>>> >>>>>>
>>> >>>>>> and running the ldapmodify command
>>> >>>>>>
>>> >>>>>> I have now started moving clients from OpenLDAP to FreeIPA and have
>>> >>>>>> today moved close to 2000 clients.
>>> >>>>>>
>>> >>>>>> However, I have noticed that IPA hangs intermittently.
>>> >>>>>>
>>> >>>>>> Running kinit admin returns the error below:
>>> >>>>>> kinit: Generic error (see e-text) while getting initial credentials
>>> >>>>>>
>>> >>>>>> In /var/log/messages, I see this entry:
>>> >>>>>>
>>> >>>>>>  prod-ipa-master-int kernel: [104090.315801] TCP: request_sock_TCP:
>>> >>>>>> Possible SYN flooding on port 88. Sending cookies.  Check SNMP counters.
>>> >>>>>
>>> >>>>> I would be worried about this message. Maybe the kernel/firewall is
>>> >>>>> doing something fishy behind your back and blocking some connections
>>> >>>>> or so.
>>> >>>>>
>>> >>>>> Petr^2 Spacek
>>> >>>>>
>>> >>>>>
>>> >>>>>> Aug 18 13:00:01 prod-ipa-master-int systemd[1]: Started Session 4885 of user root.
>>> >>>>>> Aug 18 13:00:01 prod-ipa-master-int systemd[1]: Starting Session 4885 of user root.
>>> >>>>>> Aug 18 13:01:01 prod-ipa-master-int systemd[1]: Started Session 4886 of user root.
>>> >>>>>> Aug 18 13:01:01 prod-ipa-master-int systemd[1]: Starting Session 4886 of user root.
>>> >>>>>> Aug 18 13:02:40 prod-ipa-master-int python[28984]: ansible-command Invoked with creates=None executable=None shell=True args= removes=None warn=True chdir=None
>>> >>>>>> Aug 18 13:04:37 prod-ipa-master-int sssd_be: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (KDC returned error string: PROCESS_TGS)
>>> >>>>>>
>>> >>>>>> Could this be due to the initial load of adding the clients,
>>> >>>>>> or is there something else that I need to take care of?
>>> >>
>>> >
>>>
>>>
>>> --
>>> Petr Spacek  @  Red Hat
>>>
>>
>>
>