[Freeipa-devel] slow response

Rich Megginson rmeggins at redhat.com
Wed Aug 1 20:54:14 UTC 2012


On 08/01/2012 02:49 PM, Loris Santamaria wrote:
> El mié, 01-08-2012 a las 13:50 -0600, Rich Megginson escribió:
>> On 08/01/2012 01:34 PM, Loris Santamaria wrote:
>>> El mié, 01-08-2012 a las 09:58 -0600, Rich Megginson escribió:
>>>> On 08/01/2012 09:20 AM, Loris Santamaria wrote:
>>>>> El mié, 01-08-2012 a las 10:35 -0400, Simo Sorce escribió:
>>>>>> On Tue, 2012-07-31 at 08:49 -0700, Stephen Ingram wrote:
>>>>>>> On Tue, Jul 31, 2012 at 1:39 AM, Petr Spacek<pspacek at redhat.com>   wrote:
>>>>>>>> On 07/31/2012 12:27 AM, John Dennis wrote:
>>>>>>>>> What is taking so long with session bookkeeping? I don't know yet. I would
>>>>>>>>> need more timing instrumentation. I will say when I looked at the
>>>>>>>>> python-krb5
>>>>>>>>> code (which we use to populate the ccache from the session and read back
>>>>>>>>> to
>>>>>>>>> store in the session) seemed to be remarkably inefficient. We also elected
>>>>>>>>> to
>>>>>>>>> use file based ccache rather than in-memory ccache (that means there is a
>>>>>>>>> bit
>>>>>>>>> of file-IO occurring).
>>>>>>>> A note regarding python-krbV:
>>>>>>>> I used python-krbV extensively in my thesis for KDC stress test. Python-krbV
>>>>>>>> can obtain several thousands of TGTs per second (even with ccache in a
>>>>>>>> file). AFAIK VFS calls are not done synchronously. But others parts of
>>>>>>>> python-krbV were left uncovered, so it can contain some surprises.
>>>>>>>>
>>>>>>>> === Wild speculation follows ===
>>>>>>>> 1.5 second is incredibly long time, it sounds like some kind of timeout.
>>>>>>>> Krb5 libs have usual timeout = 1 second per request.
>>>>>>>>
>>>>>>>> Are all KDCs in /etc/krb5.conf alive and reachable?
>>>>>>> In this case, as I'm referring to the extreme slowness of the Web UI,
>>>>>>> the KDC is on the same system (the ipa server) that is making the
>>>>>>> request, correct?
>>>>>>>
>>>>>>>> Is SSSD running on problematic server?
>>>>>>> Yes. Again, I'm guessing the problematic server is the IPA server itself.
>>>>>>>
>>>>>>>> Is proper KDC selected by SSSD KDC auto-locator plugin?
>>>>>>>> (See /var/lib/sss/pubconf/)
>>>>>>> Yes, I checked that file and it is the IP address of the IPA server on
>>>>>>> the same server. Perhaps should this be 127.0.0.1 instead?
>>>>>>>
>>>>>>> I also have checked the resolv.conf file, and indeed the IP points to
>>>>>>> the IPA server itself (same machine) as expected. Both forward and
>>>>>>> reverse DNS work. I'm not really sure what else could be setup
>>>>>>> incorrectly to cause any KDC slowness.
>>>>>>>
>>>>>>> Due to the extreme UI slowness issue, I have not created any replicas
>>>>>>> so this system is it. I'm not so sure I would be able to see the 1.5 s
>>>>>>> delay if it weren't compounded by the overall slowness of the Web UI,
>>>>>>> however, the KDC seems to perform well for other systems in the realm.
>>>>>>> I'm certainly not taxing it with a huge load, but tickets seem to be
>>>>>>> issued without delay.
>>>>>> Stephen,
>>>>>> another user sent me a wireshark trace for a similar performance issue.
>>>>>> So far I see a pause when doing the first leg of a SASL authentication.
>>>>>> This may well explain also your issue.
>>>>> Hi, I experience the same delay in SASL authentication. The number I
>>>>> posted on freeipa-users, show a 1-2 second delay with SASL
>>>>> authentication:
>>>>>
>>>>> # time ldapsearch -x uid=bdteg01662 dn
>>>>> # extended LDIF
>>>>> #
>>>>> # LDAPv3
>>>>> # base<dc=xxx,dc=gob,dc=ve>   (default) with scope subtree
>>>>> # filter: uid=bdteg01662
>>>>> # requesting: dn
>>>>> #
>>>>>
>>>>> # bdteg01662, users, accounts, xxx.gob.ve
>>>>> dn: uid=bdteg01662,cn=users,cn=accounts,dc=xxx,dc=gob,dc=ve
>>>>>
>>>>> # search result
>>>>> search: 2
>>>>> result: 0 Success
>>>>>
>>>>> # numResponses: 2
>>>>> # numEntries: 1
>>>>>
>>>>> real    0m0.006s
>>>>> user    0m0.001s
>>>>> sys     0m0.003s
>>>>>
>>>>> # time ldapsearch -Y GSSAPI uid=bdteg01662 dn
>>>>> SASL/GSSAPI authentication started
>>>>> SASL username: admin at XXX.GOB.VE
>>>>> SASL SSF: 56
>>>>> SASL data security layer installed.
>>>>> # extended LDIF
>>>>> #
>>>>> # LDAPv3
>>>>> # base<dc=xxx,dc=gob,dc=ve>   (default) with scope subtree
>>>>> # filter: uid=bdteg01662
>>>>> # requesting: dn
>>>>> #
>>>>>
>>>>> # bdteg01662, users, accounts, xxx.gob.ve
>>>>> dn: uid=bdteg01662,cn=users,cn=accounts,dc=xxx,dc=gob,dc=ve
>>>>>
>>>>> # search result
>>>>> search: 4
>>>>> result: 0 Success
>>>>>
>>>>> # numResponses: 2
>>>>> # numEntries: 1
>>>>>
>>>>> real    0m2.344s
>>>>> user    0m0.007s
>>>>> sys     0m0.005s
>>>> Can you post excerpts from your 389 access log showing the sequence of
>>>> operations for this connection, bind and search?
>>> Here they are:
>>>
>>> [01/Aug/2012:10:39:40 -041800] conn=33 fd=70 slot=70 connection from 172.18.32.246 to 172.18.32.246
>>> [01/Aug/2012:10:39:40 -041800] conn=33 op=0 BIND dn="" method=sasl version=3 mech=GSSAPI
>>> [01/Aug/2012:10:39:42 -041800] conn=33 op=0 RESULT err=14 tag=97 nentries=0 etime=2, SASL bind in progress
>> Yep, this is it - this should not be taking 2 seconds.  I'd like to see
>> what internal operations are going on in this time.  Try this - follow
>> the directions at http://port389.org/wiki/FAQ#Troubleshooting but for
>> the access log, to turn on Heavy trace output debugging:
>>
>> dn: cn=config
>> changetype: modify
>> replace: nsslapd-accesslog-level
>> nsslapd-accesslog-level: 4
>>
>> Then turn the access log level back to the default (256) after
>> reproducing the problem.  We should then be able to see the sequence of
>> internal operations triggered by the BIND dn="" method=sasl version=3
>> mech=GSSAPI
>>
> Here they are, with the 2 second delay:
>
> [01/Aug/2012:16:09:48 -041800] conn=Internal op=-1 SRCH base="cn=anonymous-limits,cn=etc,dc=XXXX,dc=XXXX,dc=ve" scope=0 filter="(|(objectclass=*)(objectclass=ldapsubentry))" attrs="nsLookThroughLimit nsIDListScanLimit nsPagedLookThroughLimit nsPagedIDListScanLimit nsSizeLimit nsTimeLimit nsPagedSizeLimit nsIdleTimeout"
> [01/Aug/2012:16:09:48 -041800] conn=Internal op=-1 RESULT err=0 tag=48 nentries=1 etime=0
> [01/Aug/2012:16:09:48 -041800] conn=Internal op=-1 SRCH base="cn=anonymous-limits,cn=etc,dc=XXXX,dc=XXXX,dc=ve" scope=0 filter="(|(objectclass=*)(objectclass=ldapsubentry))" attrs="nsLookThroughLimit nsIDListScanLimit nsPagedLookThroughLimit nsPagedIDListScanLimit nsSizeLimit nsTimeLimit nsPagedSizeLimit nsIdleTimeout"
> [01/Aug/2012:16:09:48 -041800] conn=Internal op=-1 RESULT err=0 tag=48 nentries=1 etime=0
> [01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 SRCH base="cn=config,cn=userRoot,cn=ldbm database,cn=plugins,cn=config" scope=1 filter="objectclass=vlvsearch" attrs=ALL
> [01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 RESULT err=0 tag=48 nentries=0 etime=0
> [01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 SRCH base="cn=anonymous-limits,cn=etc,dc=XXXX,dc=XXXX,dc=ve" scope=0 filter="(|(objectclass=*)(objectclass=ldapsubentry))" attrs="nsLookThroughLimit nsIDListScanLimit nsPagedLookThroughLimit nsPagedIDListScanLimit nsSizeLimit nsTimeLimit nsPagedSizeLimit nsIdleTimeout"
> [01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 RESULT err=0 tag=48 nentries=1 etime=0
> [01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 SRCH base="cn=config,cn=userRoot,cn=ldbm database,cn=plugins,cn=config" scope=1 filter="objectclass=vlvsearch" attrs=ALL
> [01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 RESULT err=0 tag=48 nentries=0 etime=0
> [01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 SRCH base="cn=anonymous-limits,cn=etc,dc=XXXX,dc=XXXX,dc=ve" scope=0 filter="(|(objectclass=*)(objectclass=ldapsubentry))" attrs="nsLookThroughLimit nsIDListScanLimit nsPagedLookThroughLimit nsPagedIDListScanLimit nsSizeLimit nsTimeLimit nsPagedSizeLimit nsIdleTimeout"
> [01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 RESULT err=0 tag=48 nentries=1 etime=0
> [01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 SRCH base="dc=XXXX,dc=XXXX,dc=ve" scope=2 filter="(krbPrincipalName=admin at XXXX.XXXX.VE)" attrs=ALL
> [01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 RESULT err=0 tag=48 nentries=1 etime=0
> [01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 SRCH base="uid=admin,cn=users,cn=accounts,dc=XXXX,dc=XXXX,dc=ve" scope=0 filter="(|(objectclass=*)(objectclass=ldapsubentry))" attrs="nsLookThroughLimit nsIDListScanLimit nsPagedLookThroughLimit nsPagedIDListScanLimit nsSizeLimit nsTimeLimit nsPagedSizeLimit nsIdleTimeout"
> [01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 RESULT err=0 tag=48 nentries=1 etime=0
> [01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 SRCH base="uid=admin,cn=users,cn=accounts,dc=XXXX,dc=XXXX,dc=ve" scope=0 filter="(|(objectclass=*)(objectclass=ldapsubentry))" attrs=ALL
> [01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 RESULT err=0 tag=48 nentries=1 etime=0
>
>
> Just for kick I tried the search that appears in the log, and it returns
> immediately:
>
> ldapsearch -x -b "cn=anonymous-limits,cn=etc,dc=bt,dc=gob,dc=ve" "(|(objectclass=*)(objectclass=ldapsubentry))" nsLookThroughLimit nsIDListScanLimit nsPagedLookThroughLimit nsPagedIDListScanLimit nsSizeLimit nsTimeLimit nsPagedSizeLimit nsIdleTimeout
>
> # extended LDIF
> #
> # LDAPv3
> # base<cn=anonymous-limits,cn=etc,dc=bt,dc=gob,dc=ve>  with scope
> subtree
> # filter: (|(objectclass=*)(objectclass=ldapsubentry))
> # requesting: nsLookThroughLimit nsIDListScanLimit
> nsPagedLookThroughLimit nsPagedIDListScanLimit nsSizeLimit nsTimeLimit
> nsPagedSizeLimit nsIdleTimeout
> #
>
> # anonymous-limits, etc, bt.gob.ve
> dn: cn=anonymous-limits,cn=etc,dc=bt,dc=gob,dc=ve
> nsLookThroughLimit: 5000
> nsSizeLimit: 5000
>
> # search result
> search: 2
> result: 0 Success
>
> # numResponses: 2
> # numEntries: 1

Thanks - is there a ticket open for this issue?

>
>
>>> [01/Aug/2012:10:39:42 -041800] conn=33 op=1 BIND dn="" method=sasl version=3 mech=GSSAPI
>>> [01/Aug/2012:10:39:42 -041800] conn=33 op=1 RESULT err=14 tag=97 nentries=0 etime=0, SASL bind in progress
>>> [01/Aug/2012:10:39:42 -041800] conn=33 op=2 BIND dn="" method=sasl version=3 mech=GSSAPI
>>> [01/Aug/2012:10:39:42 -041800] conn=33 op=2 RESULT err=0 tag=97 nentries=0 etime=0 dn="uid=admin,cn=users,cn=accounts,dc=XXXX,dc=XXXX,dc=ve"
>>> [01/Aug/2012:10:39:42 -041800] conn=33 op=3 SRCH base="dc=XXXX,dc=XXXX,dc=ve" scope=2 filter="(uid=bdteg01662)" attrs=ALL
>>> [01/Aug/2012:10:39:42 -041800] conn=33 op=3 RESULT err=0 tag=101 nentries=1 etime=0
>>>
>>>
>>>>>
>>>>>
>>>>>> Can you test connecting to the ldap server using ldapsearch -Y GSSAPI
>>>>>> (you need a kerberos ticket) and telling me if you experience any
>>>>>> delay ?
>>>>>> If you could run a bunch of searches in a loop and take a wireshark
>>>>>> trace that may help analyzing the timings and seeing if there is a
>>>>>> correlation.
>>>>>>
>>>>>> Simo.
>>>>>>
>>>>> _______________________________________________
>>>>> Freeipa-devel mailing list
>>>>> Freeipa-devel at redhat.com
>>>>> https://www.redhat.com/mailman/listinfo/freeipa-devel




More information about the Freeipa-devel mailing list