<div dir="ltr">Sumit,<div><br></div><div>We found a resolution for this and I'm dropping it here for posterity. After some digging, it turns out that our IPA server and IPA replica were returning different IPs in DNS responses for systems in the environment (one returned internal results, the other returned external results).</div><div><br></div><div>After resolving this, our intermittent connectivity issue went away. So it seems that in some cases the incorrect IP was being returned for LDAP requests.</div><div><br></div><div>One additional item: according to the sssd logs, the timeout to resolve an address appears to be 6 seconds. Can this be raised?</div><div><br></div><div>Thanks,</div><div><br></div><div>Jeff</div></div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature"><div dir="ltr"><span style="font-family:Arial,sans-serif">Jeff Hallyburton</span><span style="font-size:10pt;font-family:Arial,sans-serif"><br></span><span style="font-size:10pt;font-family:Arial,sans-serif">Strategic Systems Engineer<br><span style="background-image:initial;background-repeat:initial">Bloomip Inc.</span></span><span><span style="font-size:10pt;font-family:Arial,sans-serif"><br><span style="background-image:initial;background-repeat:initial">Web: </span></span><a href="http://www.bloomip.com/" style="color:rgb(17,85,204)" target="_blank"><span style="font-size:10pt;font-family:Arial,sans-serif;background-image:initial;background-repeat:initial">http://www.bloomip.com</span></a><span style="font-size:10pt;font-family:Arial,sans-serif"><br><br><span style="background-image:initial;background-repeat:initial">Engineering Support: </span></span><a href="mailto:support@bloomip.com" style="color:rgb(17,85,204)" target="_blank"><span style="font-size:10pt;font-family:Arial,sans-serif;background-image:initial;background-repeat:initial">support@bloomip.com</span></a><span style="font-size:10pt;font-family:Arial,sans-serif"><br><span 
style="background-image:initial;background-repeat:initial">Billing Support: </span></span><a href="mailto:billing@bloomip.com" style="color:rgb(17,85,204)" target="_blank"><span style="font-size:10pt;font-family:Arial,sans-serif;background-image:initial;background-repeat:initial">billing@bloomip.com</span></a><span style="font-size:10pt;font-family:Arial,sans-serif"><br><span style="background-image:initial;background-repeat:initial">Customer Support Portal:  </span></span><a href="http://my.bloomip.com/" style="color:rgb(17,85,204)" target="_blank"><span style="font-size:10pt;font-family:Arial,sans-serif;background-image:initial;background-repeat:initial">https://my.bloomip.com</span></a></span><span style="font-size:10pt;font-family:Arial,sans-serif"><br></span></div></div></div>
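<div><br></div><div>In case it helps anyone hitting the same thing, here is a rough sketch of how the mismatch between the two DNS servers could be spotted. It is only an illustration, not what we actually ran: it assumes the <code>dig</code> command is installed, the function names are made up for this example, and the lookup function can be swapped out for testing.</div>
<pre>
```python
import subprocess


def dig_lookup(server, name):
    """Return the sorted answers `server` gives for an A query on `name`.

    Assumes the `dig` CLI is available. `+short` prints one answer per line
    (this can include CNAME targets as well as addresses).
    """
    result = subprocess.run(
        ["dig", "+short", "@" + server, name, "A"],
        capture_output=True, text=True, check=True, timeout=10,
    )
    return sorted(line.strip() for line in result.stdout.splitlines() if line.strip())


def find_mismatches(names, servers, lookup=dig_lookup):
    """Return {name: {server: answers}} for names the servers disagree on."""
    mismatches = {}
    for name in names:
        answers = {server: lookup(server, name) for server in servers}
        # More than one distinct answer set means the servers disagree.
        if len({tuple(a) for a in answers.values()}) != 1:
            mismatches[name] = answers
    return mismatches
```
</pre>
<div>Calling <code>find_mismatches</code> with a list of host names and the addresses of the IPA server and the replica (placeholders, substitute your own) should return only the names for which the two servers answer differently.</div>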
<br><div class="gmail_quote">On Thu, Apr 21, 2016 at 7:47 AM, Sumit Bose <span dir="ltr"><<a href="mailto:sbose@redhat.com" target="_blank">sbose@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Wed, Apr 20, 2016 at 02:18:28PM -0400, Jeff Hallyburton wrote:<br>
> Sumit,<br>
><br>
> Raised the debug level to 10 and let it run for about 24 hours.  Uploading<br>
</span><span class="">> the last ~2000 lines of the sssd_domain.com.log.  Thanks for your help!<br>
<br>
</span>Can you send the related krb5_child log file as well?<br>
<br>
bye,<br>
Sumit<br>
<span class=""><br>
><br>
> <a href="https://pastebin.com/MD6N1Dj7" rel="noreferrer" target="_blank">https://pastebin.com/MD6N1Dj7</a><br>
><br>
</span>
<span class="">><br>
> On Tue, Apr 19, 2016 at 1:14 PM, Jeff Hallyburton <<br>
> <a href="mailto:jeff.hallyburton@bloomip.com">jeff.hallyburton@bloomip.com</a>> wrote:<br>
><br>
> > Sumit,<br>
> ><br>
> > Raised the debug level to 10 and let it run for about 24 hours.  Uploading<br>
> > the full sssd_domain.com.log.  Thanks for your help!<br>
> ><br>
> > Jeff<br>
> ><br>
</span>
<div class="HOEnZb"><div class="h5">> ><br>
> > On Mon, Apr 18, 2016 at 10:58 AM, Sumit Bose <<a href="mailto:sbose@redhat.com">sbose@redhat.com</a>> wrote:<br>
> ><br>
> >> On Fri, Apr 15, 2016 at 04:47:42PM -0400, Jeff Hallyburton wrote:<br>
> >> > After setting debug_level=8, this is what I see in the sssd_domain_log:<br>
> >><br>
> >> Unfortunately the domain log and the krb5_child log do not relate to<br>
> >> each other.<br>
> >><br>
> >> ><br>
> >> > (Fri Apr 15 20:10:46 2016) [sssd[be[<a href="http://example.com" rel="noreferrer" target="_blank">example.com</a>]]]<br>
> >> [child_handler_setup]<br>
> >> > (0x2000): Setting up signal handler up for pid [32382]<br>
> >> ><br>
> >><br>
> >> ....<br>
> >><br>
> >> ><br>
> >> > (Fri Apr 15 20:32:47 2016) [[sssd[krb5_child[32731]]]] [k5c_setup_fast]<br>
> >> > (0x0100): SSSD_KRB5_FAST_PRINCIPAL is set to [host/<br>
> >> > <a href="mailto:jump02.west-2.production.example.com@EXAMPLE.COM">jump02.west-2.production.example.com@EXAMPLE.COM</a>]<br>
> >> ><br>
> >><br>
> >> ...<br>
> >><br>
> >> > (Fri Apr 15 20:32:47 2016) [[sssd[krb5_child[32731]]]]<br>
> >> [get_and_save_tgt]<br>
> >> > (0x0400): krb5_get_init_creds_password returned [-1765328324} during<br>
> >> > pre-auth.<br>
> >> ><br>
> >> ><br>
> >> > Can you shed any light on this?<br>
> >> ><br>
> >><br>
> >> In the domain log the child with the pid 32382 is started to run a<br>
> >> pre-authentication request. The request is needed to find out which kind<br>
> >> of authentication types are available for the user, e.g. password or<br>
> >> 2-factor authentication with the OTP token. The request in the child<br>
> >> with the PID 32731 looks like a real authentication request which returns<br>
> >> with an error code -1765328324 which just means 'Generic error' but<br>
> >> might have caused SSSD to go offline.<br>
> >><br>
> >> I would like to ask you to run the test again with debug_level=10 in the<br>
> >> [domain/...] section of sssd.conf which would enable some low level<br>
> >> Kerberos tracing messages which might help to understand what kind of<br>
> >> 'Generic error' was hit here. Additionally I would like to ask you to send<br>
> >> the full log files as attachment or in an archive which would help me to<br>
> >> better navigate through them.<br>
> >><br>
> >> bye,<br>
> >> Sumit<br>
> >><br>
> ><br>
> ><br>
</div></div></blockquote></div><br></div>
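<div><br></div><div>For reference, the code -1765328324 mentioned in the quoted messages can be read as an offset from the base of the MIT krb5 error table. The snippet below is a small sketch of that arithmetic. The function name is made up for this example, and the table lists only a handful of well-known codes, not the complete set.</div>
<pre>
```python
# Base value of the MIT krb5 error table (ERROR_TABLE_BASE_krb5 in krb5.h).
KRB5_ERROR_TABLE_BASE = -1765328384

# A few well-known offsets (these match the RFC 4120 error numbers).
# Deliberately incomplete: anything else is reported as "unknown".
KNOWN_OFFSETS = {
    6: "KRB5KDC_ERR_C_PRINCIPAL_UNKNOWN",
    24: "KRB5KDC_ERR_PREAUTH_FAILED",
    25: "KRB5KDC_ERR_PREAUTH_REQUIRED",
    37: "KRB5KRB_AP_ERR_SKEW",
    60: "KRB5KRB_ERR_GENERIC",
}


def decode_krb5_error(code):
    """Return (offset, symbolic name) for a krb5 error code from the logs."""
    offset = code - KRB5_ERROR_TABLE_BASE
    return offset, KNOWN_OFFSETS.get(offset, "unknown")


print(decode_krb5_error(-1765328324))
```
</pre>
<div>The offset for the code in the log is 60, which is the generic error, so the low-level trace at debug_level=10 is what is needed to see the underlying cause.</div>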