<div dir="ltr"><div><div><div>SELinux is disabled, and we updated to 1.14.1 today. <br><br></div>This is the first crash in weeks, so we aren't that fazed, although we'd love to know it won't happen again. The servers are part of a cluster that executes automated tasks as the data comes off genome sequencing machines: clinical medical analyses that are important for the patients, etc. <br><br></div>Cheers<br></div>L.<br></div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div>------<br>The most dangerous phrase in the language is, "We've always done it this way."<br><br>- Grace Hopper<br></div></div></div></div> <br><div class="gmail_quote">On 12 September 2016 at 20:28, Lukas Slebodnik <span dir="ltr"><<a href="mailto:lslebodn@redhat.com" target="_blank">lslebodn@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On (12/09/16 11:09), Lachlan Musicman wrote:<br> >We saw another sssd crash on the weekend (well, Friday night).<br> ><br> >CentOS 7, sssd 1.14.0 from COPR<br> ><br> </span>Please upgrade to 1.14.1 from copr.<br> <span class=""><br> >Everything has worked fine for over a month until Friday.<br> ><br> >According to the log sssd_nss on the host in question:<br> ><br> > - at about 16:18, watchdog_handler killed a process for a timer overflow.<br> > - there is some flopping about as nss/sssd tries to reconnect<br> > - at 16:19:12 we see this:<br> ><br> >(Fri Sep 9 16:19:12 2016) [sssd[nss]] [sbus_dispatch] (0x0400): SBUS is<br> >reconnecting. Deferring.<br> ><br> > - Which continues until 16:20:56<br> ><br> >(Fri Sep 9 16:20:56 2016) [sssd[nss]] [sbus_dispatch] (0x0400): SBUS is<br> >reconnecting. 
Deferring.<br> ><br> >Note that there are 9,573,091 lines of this, at about 80,000 msgs per<br> >second.<br> ><br> > - nss seems to stumble back to life at this point (there are no logs on<br> >the freeipa server unfortunately)<br> ><br> > - at every 15 min interval we see this (I think this might be zabbix<br> >polling sssd):<br> ><br> >(Fri Sep 9 18:30:01 2016) [sssd[nss]] [get_client_cred] (0x0020):<br> >SELINUX_getpeercon failed [-1][Unknown error -1].<br> </span>What is the state of SELinux on your machine?<br> Please share the output of "sestatus"<br> <span class="HOEnZb"><font color="#888888"><br> LS<br> </font></span></blockquote></div><br></div>