[Freeipa-users] IPA HBAC access using SSSD for user in trusted AD domain (RHEL 6.8)

Sullivan, Daniel [AAA] dsullivan2 at bsd.uchicago.edu
Fri Jul 15 14:04:43 UTC 2016


Hi,

Changing pam_id_timeout = 60 and krb5_auth_timeout = 60 on the client, in conjunction with enabling tmpfs caching for /var/lib/sss/db on the DC, appears to have helped significantly.  The issue is now much more difficult to reproduce, although I can still reproduce it.  It now appears that rapid successive invocations of the id command will eventually return a record.  In the output below, the first command definitely took less than 60 seconds to return, probably 10-20.

I am going to look into the tuning options for sssd, and would of course be interested in any advice you could provide in this regard.  AFAIK this issue currently only impacts users with a large number of groups (in fact, after tuning as described above I have only been able to trigger it for one user).  I am going to script a test that does a lookup for every single ID Override user in our environment to see what kind of hit rate I get.  I’ll report back.  Thank you again for your help.
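For readers following along, the two client-side settings mentioned above would sit in sssd.conf roughly as sketched here. This is only an illustration: pam_id_timeout is a [pam] section option and krb5_auth_timeout is a domain section option, but the domain section name below is a placeholder, not the name used in this environment.

```ini
# /etc/sssd/sssd.conf on the client (fragment only)

[pam]
# Seconds for which identity information is cached during a PAM session
pam_id_timeout = 60

# "ipa.example.test" is a placeholder for the real IPA domain section name
[domain/ipa.example.test]
# Seconds after which an online authentication request is aborted
krb5_auth_timeout = 60
```

SSSD has to be restarted on the client for the change to take effect.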

[root at cri-kcriwebgdp1 log]# id rcrist

id: rcrist: No such user
[root at cri-kcriwebgdp1 log]# id rcrist
uid=339748142(rcrist) gid=339748142(rcrist) groups=339748142(rcrist),339801232(cri-aaa_static_hosting),788635799(adm-sde-clients),788600520(group policy creator owners),788602710(bsd exchange view only administrators),339792922(cri-all_users),788659064(aaa-static_hosting_groups),788601114(bsd$ dns read),788609545(adm-trackitusers),339806103(cri-ciscat),788609528(adm-bsd-mis),788619855(adm-oua-dl),788615498(adm-himss),788637726(adm-dstmlist-dl),788600513(domain users),788601110(bsd$ all oua),788654299(cri-all_groups),788658170(ocr-sharepoint ocr members),788619946(adm-trackitreports),788638566(ocr-coi),788633650(#ocr-office-dl),788644425(ocr velos email),788609542(adm-testgroup1),788638733(ocr-dfc-users),788665477(med-section_shares-clinical trials (only)),788609532(adm-bsdis-print),788634332(ocr-clinical research),788609546(adm-tss),788658806(ocr-hiro),788672525(ocr-bsdvpn-allow),788640103(adm shpt srp contributors),788659092(ocr-sharepoint-velosupgrade),788639053(ocr-velos-tickets),788610719(adm-premigration-proofpoint),788635798(adm-sde-techs),788635657(adm-www-clinres),788653680(ocr-email-management),788663575(ocr-bsdirb),788658171(ocr-sharepoint irb members),788650124(ocr it),788609567(ors-teleform),788653595(ocr$ oua),788609341(ic),788646237(adm shpt ocr visitors),788609544(adm-trackittech),788671562(ocr-ocrepic),788652940(dma management)
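The per-user lookup test mentioned above could be scripted along these lines. This is a minimal sketch, not the script actually used: check_users is a hypothetical helper name, the list of usernames is assumed to arrive on stdin one per line, and the lookup command is a parameter (defaulting to id) only so the loop can be exercised without SSSD.

```shell
#!/bin/sh
# Hypothetical helper: read usernames (one per line) on stdin, run a lookup
# command for each, print every miss, and finish with a summary line.
check_users() {
    lookup_cmd=${1:-id}
    resolved=0
    failed=0
    while IFS= read -r user; do
        # Skip blank lines in the input list.
        [ -n "$user" ] || continue
        # Redirect stdin so the lookup command cannot consume the user list.
        if "$lookup_cmd" "$user" </dev/null >/dev/null 2>&1; then
            resolved=$((resolved + 1))
        else
            failed=$((failed + 1))
            printf 'MISS %s\n' "$user"
        fi
    done
    printf 'resolved=%d failed=%d\n' "$resolved" "$failed"
}
```

A run would look like `check_users < users.txt`, where users.txt is an assumed file holding the override usernames. Because a failed first lookup is often followed by a successful second one, running the same list twice without clearing the cache in between should help separate transient misses from persistent ones.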

Dan


On Jul 15, 2016, at 8:22 AM, Sullivan, Daniel [AAA] <dsullivan2 at bsd.uchicago.edu> wrote:

Jakub,

Sure, no problem, I am happy to provide the output that you are requesting.  Thank you for taking the time to help me.

To answer your question, no record is returned (not missing groups). For example, the output of the failure was:

[root at cri-kcriwebgdp1 log]# id mjarsulic
id: mjarsulic: No such user

As per your request, I have attached domain and nss logs for a lookup on the user ‘spott’ (command invoked on the client: 'id spott').  The lookup was run immediately after executing 'sss_cache -E; service sssd stop ; rm -rf /var/log/sssd/*; service sssd start' on the client:

IPA - https://gist.github.com/dsulli99/4e45faa39474b9131be811e4a0779c40
NSS - https://gist.github.com/dsulli99/e2e10da34ff860ec15e56ea521eb8315

Not every record fails, and the behavior is inconsistent between lookups (i.e. sometimes a user will look up correctly, sometimes not), but in some situations a timeout appears in the nss logs (not in the failure above).  In these situations it looks to me like the query is dispatched to the DC and the lookup times out.  If I wait a little and perform the lookup on the same user again, the record is returned (presumably because the DC eventually resolved and cached the query?).  We are migrating from CentrifyDC and have loaded 2000+ custom ID overrides into our Default Trust ID View; perhaps we will need to implement tmpfs caching for /var/lib/sss/db on the DC as described in your performance tuning document (https://jhrozek.wordpress.com/2015/08/19/performance-tuning-sssd-for-large-ipa-ad-trust-deployments/).  These timeouts look like:

(Fri Jul 15 07:21:04 2016) [sssd[nss]] [get_dp_name_and_id] (0x0400): Not a LOCAL view, continuing with provided values.
(Fri Jul 15 07:21:04 2016) [sssd[nss]] [sss_dp_issue_request] (0x0400): Issuing request for [0x41e750:1:bson at bsdad.uchicago.edu@bsdad.uchicago.edu]
(Fri Jul 15 07:21:04 2016) [sssd[nss]] [sss_dp_get_account_msg] (0x0400): Creating request for [bsdad.uchicago.edu][0x1][BE_REQ_USER][1][name=bson at bsdad.uchicago.edu:-]
(Fri Jul 15 07:21:04 2016) [sssd[nss]] [sbus_add_timeout] (0x2000): 0x1fa9020
(Fri Jul 15 07:21:04 2016) [sssd[nss]] [sss_dp_internal_get_send] (0x0400): Entering request [0x41e750:1:bson at bsdad.uchicago.edu@bsdad.uchicago.edu]
(Fri Jul 15 07:21:17 2016) [sssd[nss]] [sbus_remove_timeout] (0x2000): 0x1fa9020
(Fri Jul 15 07:21:17 2016) [sssd[nss]] [sbus_dispatch] (0x4000): dbus conn: 0x1fa0730
(Fri Jul 15 07:21:17 2016) [sssd[nss]] [sbus_dispatch] (0x4000): Dispatching.
(Fri Jul 15 07:21:17 2016) [sssd[nss]] [sss_dp_get_reply] (0x1000): Got reply from Data Provider - DP error code: 3 errno: 110 error message: Connection timed out
(Fri Jul 15 07:21:17 2016) [sssd[nss]] [nss_cmd_getby_dp_callback] (0x0040): Unable to get information from Data Provider
Error: 3, 110, Connection timed out
Will try to return what we have in cache
(Fri Jul 15 07:21:17 2016) [sssd[nss]] [sss_dp_req_destructor] (0x0400): Deleting request: [0x41e750:1:bson at bsdad.uchicago.edu@bsdad.uchicago.edu]
(Fri Jul 15 07:21:17 2016) [sssd[nss]] [reset_idle_timer] (0x4000): Idle timer re-set for client [0x1fa7fc0][22]
(Fri Jul 15 07:21:17 2016) [sssd[nss]] [reset_idle_timer] (0x4000): Idle timer re-set for client [0x1fa7fc0][22]
(Fri Jul 15 07:21:17 2016) [sssd[nss]] [client_recv] (0x0200): Client disconnected!
(Fri Jul 15 07:21:17 2016) [sssd[nss]] [client_close_fn] (0x2000): Terminated client [0x1fa7fc0][22]
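In the excerpt above the request is issued at 07:21:04 and the timeout reply arrives at 07:21:17, about 13 seconds later. To see how often this happens across a whole log, the timed-out replies can be counted with a small helper such as the sketch below. The function names are hypothetical, and the pattern simply matches the sss_dp_get_reply line shown above with errno 110 (ETIMEDOUT on Linux).

```shell
#!/bin/sh
# Count Data Provider replies that carried errno 110 in an SSSD NSS
# responder log. Prints 0 when there are none.
count_dp_timeouts() {
    # grep -c exits non-zero on zero matches; "|| true" keeps the helper
    # usable under "set -e" while still printing the count.
    grep -c 'sss_dp_get_reply.*errno: 110' "$1" || true
}

# Print each matching reply line, timestamp included, for closer inspection.
list_dp_timeouts() {
    grep 'sss_dp_get_reply.*errno: 110' "$1" || true
}
```

On the client this would typically be pointed at the NSS responder log, for example `count_dp_timeouts /var/log/sssd/sssd_nss.log`.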

I’m going to implement tmpfs caching on the DC; hopefully this will address at least a subset of these lookup failures.  I’ll report back with my findings.

Thank you again for your help.

Best,

Dan Sullivan




On Jul 15, 2016, at 7:12 AM, Jakub Hrozek <jhrozek at redhat.com> wrote:

On Fri, Jul 15, 2016 at 12:00:56PM +0000, Sullivan, Daniel [AAA] wrote:
Lukas,

Thank you for your reply and inquiry.

First, to answer your question: yes, we have been using the default_domain_suffix for some time.  I am not sure what you mean by previously, but it is currently implemented and was implemented prior to our 1.13 -> 1.14 upgrade.

And yes, I am assessing a possible software regression at the
current moment. It might be related to the default_domain_suffix
you are inquiring about.  Basically I am getting inconsistent
results on invocation of the id command with specifying the username
as ‘username’ or ‘username at fqdn’ on a client running 1.14
against a DC running 1.13 (i.e. no way to reliably invoke id against a
trusted domain account).  Sometimes the command will return a result,
and sometimes it will not.

No result or missing groups?

Looking at nss debug logs it appears that
a duplicate fqdn is being appended to the nss query as shown here (as
@bsdad.uchicago.edu at bsdad.uchicago.edu).
This lookup fails.

Yes, this is wrong, can you send me the full NSS and domain logs please?
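When collecting the logs requested above, one quick way to find the lines where a name carries the same domain suffix twice is a fixed-string grep. This is only a sketch: find_doubled_suffix is a hypothetical helper, both arguments are supplied by the caller, and a match is just a starting point for reading the surrounding log lines, since this thread does not establish which log messages may legitimately contain such a string.

```shell
#!/bin/sh
# Print log lines in which a name carries the same domain suffix twice,
# e.g. user@example.test@example.test.
find_doubled_suffix() {
    domain=$1
    logfile=$2
    # -F treats the pattern as a literal string, so dots in the domain
    # are not interpreted as regex wildcards.
    grep -F "@${domain}@${domain}" "$logfile" || true
}
```

For the domain discussed here, the call would be something like `find_doubled_suffix bsdad.uchicago.edu /var/log/sssd/sssd_nss.log`.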

--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


********************************************************************************
This e-mail is intended only for the use of the individual or entity to which
it is addressed and may contain information that is privileged and confidential.
If the reader of this e-mail message is not the intended recipient, you are
hereby notified that any dissemination, distribution or copying of this
communication is prohibited. If you have received this e-mail in error, please
notify the sender and destroy all copies of the transmittal.

Thank you
University of Chicago Medicine and Biological Sciences
********************************************************************************
