[Freeipa-users] Needs help understand this timeout issue

Troels Hansen th at casalogic.dk
Wed Feb 8 07:27:02 UTC 2017


No, ignore_group_members option is already set.
Tried setting and removing it on the client but didn't make a huge different.

Is a client have an completely empty cache, newly joined, empty /var/lig/sssd/db and mc cache or something, the IPA server ALWAYS asks the AD for group info, despite having a valid cache.

This leads to a delay of up to, or even more 5 minutes before a user can login. Being denied access at first, but logging again later with success.

If the client have a cache thats only expired its able to pull the account info from the cache on the IPA, without the IPA having to contact the AD.


----- On Feb 6, 2017, at 2:24 PM, Sullivan, Daniel [CRI] dsullivan2 at bsd.uchicago.edu wrote:

> Have you looked at the ignore_group_members option?  Maybe this is the problem
> you are seeing?
> 
> https://jhrozek.wordpress.com/2015/08/19/performance-tuning-sssd-for-large-ipa-ad-trust-deployments/
> 
> ==snip==
> 
> ignore_group_members
> Normally the most data-intensive operation is downloading the groups including
> their members. Usually, we are interested in what groups a user is a member of
> (id aduser at ad_domain) as the initial step rather than what members do specific
> groups include (getent group adgroup at ad_domain). Setting the
> ignore_group_members option to True makes all groups appear as empty, thus
> downloading only information about the group objects themselves and not their
> members, providing a significant performance boost. Please note that id
> aduser at ad_domain would still return all the correct groups!
> ==snip==
> 
> 
> Dan
> 
> On Feb 6, 2017, at 1:59 AM, Troels Hansen <th at casalogic.dk> wrote:
> 
> Hi
> 
> I'm aware of the anatomy of how the lookup is done, but I would suppose a valid
> cache on the IPA server would result in the cache from the IPA server being
> used?
> 
> I have been debugging this issue some more, and can confirm is the client have
> its sssd cache invalidated by "sss_cache -E" and the IPA server have a valid
> cache, when the client asks for the user, the IPA server still asks the AD for
> the entire group info?
> 
> I can see, that even though the cache is refreshed the attribute
> initgrExpireTimestamp (in the ldb cache) isn't updated.
> I have been unable to find out exactly what this controls?
> 
> lastUpdate and dataExpireTimestamp is updated to the time stamp of when the
> refresh ran.
> 
> 
> ----- On Feb 1, 2017, at 2:27 PM, Sullivan, Daniel [CRI]
> dsullivan2 at bsd.uchicago.edu wrote:
> 
> Have you checked to see if the user is expired in the cache, or if it is
> impacted by entry_cache_nowait_percentage (ref sssd.conf). The default entry
> timeout is only 90 minutes and entry_cache_nowait_percentage default is 50.
> 
> ldbsearch -H  /var/lib/sss/db/timestamps_xxx.xxx.uchicago.edu.ldb >
> ~/timestamps.txt
> ldbsearch -H  /var/lib/sss/db/cache_xxx.xxx.uchicago.edu.ldb > ~/cache.txt
> 
> Might be able to provide more info.
> 
> Again, I’d really familiarize yourself with the anatomy of an sssd lookup, if
> you get to know the diagram and steps 1-7 here it will be a big help to your
> question(s).
> https://jhrozek.wordpress.com/2015/03/11/anatomy-of-sssd-user-lookup/
> 
> Dan
> 
> 
> On Feb 1, 2017, at 4:32 AM, Troels Hansen <th at casalogic.dk> wrote:
> 
> From looking af at TCP dump, I can see that if a client requests a AD user from
> IPA, IPA does a full user lookup in AD, even though the IPA server have the
> user in local cache?
> 
> It looks like a single group generates a LOT of traffic to AD like:
> client -> IPA
> IPA -> client
> IPA -> AD
> AD -> IPA
> IPA -> AD
> IPA -> Second AD
> Second AD -> IPA
> IPA -> Second AD
> IPA -> AD
> AD -> IPA
> IPA -> AD
> IPA -> client
> client -> IPA
> IPA -> Client
> 
> Something similar continues for every group the user has.
> 
> Soo, I guess the question that I haven't been able to find documented anywhere.
> Isn't the SSSD group cache on the IPA servers supposed to be used then a sssd
> client requests a user?
> 
> 
> 
> ----- On Feb 1, 2017, at 9:53 AM, Troels Hansen th at casalogic.dk wrote:
> 
> Hmm, suspect its happening on the server...... thous I haven't been able to
> pinpoint a log entry that confirms my suspecting.
> 
> I have pinpointed the timeout to happen after 58 seconds after completely
> removing the SSSD cache and restaring SSSD, which leads me to think my issue is
> related to ldap_enumaration_search_timeout which defaults to 60.
> 
> rm -fr /var/lib/sss/{mc,db}/* && systemctl restart sssd && time id shja
> 
> However, I'm unable to crank it up on the server as it seems to be adjusted Down
> to 60 Again?
> 
> Wed Feb  1 09:40:56 2017) [sssd[be[lx.dr.dk]]] [dp_get_options] (0x0400): Option
> ldap_enumeration_search_timeout has value 120
> (Wed Feb  1 09:40:56 2017) [sssd[be[lx.dr.dk]]] [dp_copy_options_ex] (0x0400):
> Option ldap_enumeration_search_timeout has value 60
> (Wed Feb  1 09:40:56 2017) [sssd[be[lx.dr.dk]]] [dp_copy_options_ex] (0x0400):
> Option ldap_enumeration_search_timeout has value 60
> 
> LDAP seems speedy enough, not timeouts while querying it manually while a client
> is doing a user lookup.
> 
> ----- On Jan 30, 2017, at 6:06 PM, Sullivan, Daniel [CRI]
> dsullivan2 at bsd.uchicago.edu wrote:
> 
> 
> If the timeout is occurring on the server, I would start by increasing one or
> both of these values:
> 
> ldap_opt_timeout
> ldap_search_timeout
> 
> If that doesn’t work I’d take look to see if the 389 server is under high load
> when you are performing this operation.  The easiest way I have found to do
> this is to just execute an LDAP query directly against the IPA server when you
> are performing an id lookup, for example:
> 
> ldapsearch -D "cn=Directory Manager" -w <pw> -s base -b "cn=config"
> "(objectclass=*)”
> 
> If the LDAP server is not responsive you probably need to increase the number of
> worker threads for 389ds.  Also, you might want to disable referrals, check out
> the man pages for this;
> 
> ldap_referrals = false
> 
> Also, FWIW, if you crank up debug logging on the sssd client, you should be able
> to see the amount of seconds of timeout assigned to the operation, and witness
> the fact that the operation is actually timing out by inspecting the logs
> themselves.  The logs can get a little verbose but the data is there.
> 
> 
> --
> Manage your subscription for the Freeipa-users mailing list:
> https://www.redhat.com/mailman/listinfo/freeipa-users
> Go to http://freeipa.org for more info on the project
> 
> --
> Med venlig hilsen
> 
> Troels Hansen
> 
> Systemkonsulent
> 
> Casalogic A/S
> 
> 
> T (+45) 70 20 10 63
> 
> M (+45) 22 43 71 57
> 
> Red Hat, SUSE, VMware, Citrix, Novell, Yellowfin BI, EnterpriseDB, Sophos og
> meget mere.
> 
> --
> Manage your subscription for the Freeipa-users mailing list:
> https://www.redhat.com/mailman/listinfo/freeipa-users
> Go to http://freeipa.org for more info on the project
> 
> --
> Med venlig hilsen
> 
> Troels Hansen
> 
> Systemkonsulent
> 
> Casalogic A/S
> 
> 
> T (+45) 70 20 10 63
> 
> M (+45) 22 43 71 57
> 
> Red Hat, SUSE, VMware, Citrix, Novell, Yellowfin BI, EnterpriseDB, Sophos og
> meget mere.

-- 
Med venlig hilsen 

Troels Hansen 

Systemkonsulent 

Casalogic A/S 


T (+45) 70 20 10 63 

M (+45) 22 43 71 57 

Red Hat, SUSE, VMware, Citrix, Novell, Yellowfin BI, EnterpriseDB, Sophos og meget mere.




More information about the Freeipa-users mailing list