[Freeipa-users] Fwd: Marking subdomain offline

Chris Dagdigian dag at sonsorol.org
Thu Apr 6 18:39:02 UTC 2017


I see similar things in our environment where IPA is used as "glue" 
between AD Forests that have a 1-way trust relationship. We believe that 
the root cause has something to do with the 30+ domain controllers the 
IPA client tries to make contact with (in seemingly random order) across 
the AD Forest.  Very hard to reproduce but the "subdomain marked 
offline" problem is one we see often in the sssd logs. We think that 
there are some AD servers in our sprawling environment that we either 
can't reach properly over the network (firewalls, etc.) or are just 
plain not configured to talk properly to us.  Login success depends on 
hitting a happy domain controller.
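
To put a number on how often this happens, a minimal sketch along these 
lines can count the "Marking subdomain ... offline" messages per subdomain. 
It assumes the message format shown in the log excerpt quoted below; the 
function name count_offline_events is made up for illustration and is not 
part of SSSD.

```python
import re
from collections import Counter

# Message format taken from the be_mark_dom_offline line in the quoted log.
OFFLINE_RE = re.compile(
    r"\[be_mark_dom_offline\] \(0x[0-9a-fA-F]+\): Marking subdomain (\S+) offline"
)


def count_offline_events(text):
    """Return a Counter mapping subdomain name -> number of offline events."""
    # Strip mail quote markers so logs pasted from an email also work.
    unquoted = re.sub(r"^> ?", "", text, flags=re.M)
    # Mail clients wrap long log lines; collapse all whitespace to rejoin them.
    flattened = " ".join(unquoted.split())
    return Counter(m.group(1) for m in OFFLINE_RE.finditer(flattened))
```

Feeding it the contents of the backend log (typically 
/var/log/sssd/sssd_<domain>.log) gives a per-subdomain tally. The debug 
level has to be high enough for the 0x1000 messages to be written at all.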

We are VERY interested in the recent updates to IPA server that seem to 
indicate we can "pin" clients to specific AD controllers. From my 
understanding we just need to wait until the SSSD software gets broad 
support for this feature as well. Once we can do that we plan to pin our 
clients to named controllers and see if that helps with any of the 
intermittent login problems.
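
As a rough sketch of what that pinning might look like: newer SSSD 
releases document a trusted-domain section in sssd.conf, named 
[domain/<ipa domain>/<ad domain>], that accepts an ad_server option. I am 
assuming that is the feature in question. The helper name and the dc01/dc02 
host names below are hypothetical, and the domain names are borrowed from 
the log excerpt quoted below.

```python
import configparser
import io


def render_pinned_subdomain(ipa_domain, ad_domain, ad_servers):
    """Render an sssd.conf trusted-domain section listing preferred AD DCs."""
    config = configparser.ConfigParser()
    # Section name form assumed from the sssd.conf trusted-domain syntax.
    section = f"domain/{ipa_domain}/{ad_domain}"
    # ad_server takes a comma-separated list, tried in the order given.
    config[section] = {"ad_server": ", ".join(ad_servers)}
    buf = io.StringIO()
    config.write(buf)
    return buf.getvalue()


# Hypothetical controller names, for illustration only.
print(render_pinned_subdomain(
    "unix.foo.local", "foo.local", ["dc01.foo.local", "dc02.foo.local"]
))
```

The rendered section would be added to sssd.conf on whichever hosts talk 
to AD directly. Whether a given SSSD build honours it depends on the 
version, so check the sssd.conf man page on the host first.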

One workaround we've started to use for power users is collecting public 
SSH keys and hosting them in the IPA server. As long as IPA knows that 
the user "exists" in AD and has a roughly complete group membership 
list, logging in with an SSH key instead of an AD password bypasses the 
transient password checking failures and is very quick.

Chris

> mike at chinewalking.com
> April 6, 2017 at 1:21 PM
> Hi,
>
> My IPA<->AD trust setup experiences intermittent failures during login 
> events. The AD subdomain goes in an inactive/offline state and users 
> logging in are put into a 'delayed authentication' queue. Usually 
> logging in after a minute or so succeeds as the subdomain is reset and 
> the user is cached for following events. At all times getent/id and 
> kinit are successful, even with a purged sssd cache.
> SRV records are correctly resolved, except for _kerberos-master.
>
> I have not been able to further troubleshoot the intermittent 
> failures. Traffic captures show no strange behaviour, yet the 
> sssd_domain log is clearly showing AD to be unreachable at times. All 
> AD servers are W2012. Masking the _ldap and _kerberos DNS records down 
> to single nodes, to factor out any faulty Windows configs, has so far 
> not had any effect (would it?).
>
> sssd's data_provider_fo.c :> be_fo_reset_svc() calls fo_get_service(), 
> which returns EOK. I'm not familiar yet with the variables at play, 
> would adding debug statements here reveal faults that may cause this?
>
> Any pointers are very much appreciated.
>
> Mike
>
>
> [sssd[be[unix.foo.local]]] [ipa_srv_ad_acct_lookup_step] (0x0400): 
> Looking up AD account
> [sssd[be[unix.foo.local]]] [ipa_srv_ad_acct_lookup_done] (0x0080): 
> Sudomain lookup failed, will try to reset sudomain..
> [sssd[be[unix.foo.local]]] [ipa_server_trusted_dom_setup_send] 
> (0x1000): Trust direction of subdom foo.local from forest foo.local 
> is: one-way inbound: local domain trusts the remote domain
> [sssd[be[unix.foo.local]]] [ipa_server_trusted_dom_setup_1way] 
> (0x0400): Will re-fetch keytab for foo.local
> [sssd[be[unix.foo.local]]] [ipa_getkeytab_send] (0x0400): Retrieving 
> keytab for UNIX$@FOO.local from ipa01.unix.foo.local into 
> /var/lib/sss/keytabs/foo.local.keytab6AXxWV using ccache 
> /var/lib/sss/db/ccache_UNIX.FOO.local
> [sssd[be[unix.foo.local]]] [child_handler_setup] (0x2000): Setting up 
> signal handler up for pid [6242]
> [sssd[be[unix.foo.local]]] [child_handler_setup] (0x2000): Signal 
> handler set up for pid [6242]
> [sssd[be[unix.foo.local]]] [sdap_process_result] (0x2000): Trace: 
> sh[0x7f71cd9ddb80], connected[1], ops[(nil)], ldap[0x7f71cd9e65a0]
> [sssd[be[unix.foo.local]]] [sdap_process_result] (0x2000): Trace: end 
> of ldap_result list
> [sssd[be[unix.foo.local]]] [ad_online_cb] (0x0400): The AD provider is 
> online
> [sssd[be[unix.foo.local]]] [be_ptask_online_cb] (0x0400): Back end is 
> online
> [sssd[be[unix.foo.local]]] [be_ptask_enable] (0x0080): Task 
> [Subdomains Refresh]: already enabled
> Keytab successfully retrieved and stored in: 
> /var/lib/sss/keytabs/foo.local.keytab6AXxWV
> [sssd[be[unix.foo.local]]] [child_sig_handler] (0x1000): Waiting for 
> child [6242].
> [sssd[be[unix.foo.local]]] [child_sig_handler] (0x0100): child [6242] 
> finished successfully.
> [sssd[be[unix.foo.local]]] [ipa_getkeytab_recv] (0x2000): 
> ipa-getkeytab status 0
> [sssd[be[unix.foo.local]]] [ipa_server_trust_1way_kt_done] (0x0400): 
> Keytab successfully retrieved to 
> /var/lib/sss/keytabs/foo.local.keytab6AXxWV
> [sssd[be[unix.foo.local]]] [ipa_server_trust_1way_kt_done] (0x2000): 
> Keytab renamed to /var/lib/sss/keytabs/foo.local.keytab
> [sssd[be[unix.foo.local]]] [ipa_server_trust_1way_kt_done] (0x0400): 
> Keytab /var/lib/sss/keytabs/foo.local.keytab6AXxWV contains the 
> expected principals
> [sssd[be[unix.foo.local]]] [ipa_server_trust_1way_kt_done] (0x0400): 
> Established trust context for foo.local
> [sssd[be[unix.foo.local]]] [unique_filename_destructor] (0x2000): 
> Unlinking [/var/lib/sss/keytabs/foo.local.keytab6AXxWV]
> [sssd[be[unix.foo.local]]] [unlink_dbg] (0x2000): File already 
> removed: [/var/lib/sss/keytabs/foo.local.keytab6AXxWV]
> [sssd[be[unix.foo.local]]] [ipa_srv_ad_acct_retried] (0x0400): 
> Sudomain re-set, will retry lookup
> [sssd[be[unix.foo.local]]] [be_fo_reset_svc] (0x1000): Resetting all 
> servers in service foo.local
> [sssd[be[unix.foo.local]]] [be_fo_reset_svc] (0x0080): Cannot retrieve 
> service [foo.local]
> [sssd[be[unix.foo.local]]] [ipa_srv_ad_acct_lookup_step] (0x0400): 
> Looking up AD account
> [sssd[be[unix.foo.local]]] [be_mark_dom_offline] (0x1000): Marking 
> subdomain foo.local offline
> [sssd[be[unix.foo.local]]] [ipa_srv_ad_acct_lookup_done] (0x0040): 
> ipa_get_*_acct request failed: [1432158270]: Subdomain is inactive.
> [sssd[be[unix.foo.local]]] [ipa_subdomain_account_done] (0x0040): 
> ipa_get_*_acct request failed: [1432158270]: Subdomain is inactive.
> [sssd[be[unix.foo.local]]] [dp_reply_std_set] (0x0080): DP Error is OK 
> on failed request?
> [sssd[be[unix.foo.local]]] [dp_req_done] (0x0400): DP Request [Account 
> #4]: Request handler finished [0]: Success
>



