[Freeipa-users] ipa-replica-install IPA startup timing issue
thierry bordaz
tbordaz at redhat.com
Wed Mar 16 15:24:48 UTC 2016
Hello Daryl,
I can reproduce locally the slow DS startup (due to slapi-nis
priming). In fact the version I was using had not the slapi-nis fix
to differ the priming.
I failed to reproduce the intensive load on DS when krb5kdc startup.
Looking at yours logs, we can see that krb5kdc startup triggers a
set of requests during 3s up to 8s. The logs are looking like (note
the etime can go up to 2s):
[10/Mar/2016:14:20:35 -0600] conn=40 fd=87 slot=87 connection from
local to /var/run/slapd-UOFMT1.socket
[10/Mar/2016:14:20:36 -0600] conn=40 AUTOBIND dn="cn=Directory Manager"
[10/Mar/2016:14:20:36 -0600] conn=40 op=0 BIND dn="cn=Directory
Manager" method=sasl version=3 mech=EXTERNAL
[10/Mar/2016:14:20:36 -0600] conn=40 op=0 RESULT err=0 tag=97
nentries=0 *etime=1* dn="cn=Directory Manager"
[10/Mar/2016:14:20:36 -0600] conn=40 op=1 SRCH
base="cn=UOFMT1,cn=kerberos,dc=uofmt1" scope=0
filter="(objectClass=*)" attrs=ALL
[10/Mar/2016:14:20:36 -0600] conn=40 op=1 RESULT err=0 tag=101
nentries=1 etime=0
[10/Mar/2016:14:20:36 -0600] conn=40 op=2 SRCH
base="cn=ipaConfig,cn=etc,dc=uofmt1" scope=0
filter="(objectClass=*)" attrs="ipaConfigString ipaKrbAuthzData
ipaUserAuthType"
[10/Mar/2016:14:20:36 -0600] conn=40 op=2 RESULT err=0 tag=101
nentries=1 etime=0
[10/Mar/2016:14:20:36 -0600] conn=40 op=3 SRCH base="dc=uofmt1"
scope=2 filter="(objectClass=ipaNTDomainAttrs)" attrs="ipaNTFlatName
ipaNTFallbackPrimaryGroup ipaNTSecurityIdentifier"
[10/Mar/2016:14:20:37 -0600] conn=40 op=3 RESULT err=0 tag=101
nentries=0 *etime=1*
[10/Mar/2016:14:20:37 -0600] conn=40 op=4 SRCH
base="cn=UOFMT1,cn=kerberos,dc=uofmt1" scope=0 filter="(krbMKey=*)"
attrs="krbMKey"
[10/Mar/2016:14:20:37 -0600] conn=40 op=4 RESULT err=0 tag=101
nentries=1 etime=0
[10/Mar/2016:14:20:37 -0600] conn=40 op=5 SRCH base="dc=uofmt1"
scope=2
filter="(&(|(objectClass=krbprincipalaux)(objectClass=krbprincipal))(krbPrincipalName=K/M at UOFMT1))"
attrs="krbPrincipalName krbCanonicalName ipaKrbPrincipalAlias
krbUPEnabled krbPrincipalKey krbTicketPolicyReference
krbPrincipalExpiration krbPasswordExpiration krbPwdPolicyReference
krbPrincipalType krbPwdHistory krbLastPwdChange krbPrincipalAliases
krbLastSuccessfulAuth krbLastFailedAuth krbLoginFailedCount
krbExtraData krbLastAdminUnlock krbObjectReferences krbTicketFlags
krbMaxTicketLife krbMaxRenewableAge nsAccountLock passwordHistory
ipaKrbAuthzData ipaUserAuthType ipatokenRadiusConfigLink objectClass"
[10/Mar/2016:14:20:38 -0600] conn=40 op=5 RESULT err=0 tag=101
nentries=1 *etime=1*
[10/Mar/2016:14:20:39 -0600] conn=40 op=6 UNBIND
[10/Mar/2016:14:20:39 -0600] conn=40 op=6 fd=87 closed - U1
I think the request op=3 (SRCH base="dc=uofmt1" scope=2
filter="(objectClass=ipaNTDomainAttrs)") is slow also because of
slapi-nis. In fact it is indexed and returns 0 entry. So only
plugins can create this high etime.
An improvement in slapi-nis makes its search callback noop when it
comes from krb and I am running this improvement.
In conclusion I think both slow DS startup and KRB5 startup are
fixed in RHEL 7.
thanks
theirry
On 03/14/2016 08:46 PM, Daryl Fonseca-Holt wrote:
> Hello Thierry,
>
> Attached is the pstacks from only the final DS restart. I don't think
> they will show the whole picture.
>
> According to the debug log /var/log/ipareplica-install.log (attached)
> the start of the krb5kdc.service (19:13:16Z) is successful, but the
> krb5kdc log (attach) shows it is unable to fetch the master K/M at
> 14:31:31CDT (-5hour offset). This is when the install log shows
> kadmind failing.
>
> In my experience with the master observing top there are two intense
> times for ns-slapd-<instance>. The first when it start, of course, and
> the second when krb5kdc starts. I assume this is because krb5kdc must
> get it's configuration and data from the same DS. krb5kdc fails but
> the ipareplica-install script isn't aware of it. Finally
> kadmin.service tries to access krb5kdc and finds that it is dead.
>
> Please note these logs are with Schema Compatability and NIS plugins
> turned off per the other e-mail from Alexander.
>
> I've noticed on a running master I can prevent this type of failure by
> manually starting dirsrv (systemctl start dirsrv@<instance>.service),
> watch top until all threads of ns-slapd have settled, then systemctl
> start krb5kdc.service, again watching top until ns-slapd threads have
> settled down before systemctl start kadmin.service. This kind of
> manual intervention is is not possible when running the
> ipareplica-install script.
>
> I will look into introducing a delay at the completion of the dirsrv
> and krb5kdc systemd units and see if I can accommodate
> ipareplica-install. Just as an experiment for now. I need to advance
> the project into High Availability testing but cannot do so without a
> functioning replica.
>
> Regards, Daryl
>
> On 03/14/16 09:20, thierry bordaz wrote:
>> Hi Daryl,
>>
>> Thanks for all the data. I will look at the pstacks. A first look
>> shows that you capture import, bind... so may be a complete
>> ipa-replica-install session.
>> I will try to retrieve the specific startup time to see what was
>> going on at that time.
>> If you have the time to monitor only startup, it will help me
>> shrinking the set of pstacks.
>> Startup of DS last > 1min. If you may start DS and as soon as the
>> ns-slapd process is launched, do regular pstacks. Then when you are
>> able to send a simple ldapsearch (ldapsearch -x -b "" -s base), you
>> may stop taking pstacks.
>>
>> thanks
>> thierry
>>
>> On 03/14/2016 03:06 PM, Daryl Fonseca-Holt wrote:
>>> Hi Thierry,
>>>
>>> I moved the old logs into a subdirectory called try1. I did the
>>> recommended ipa-server-install --uninstall. Tried the replica
>>> install again. Failed during kadmind start like the previous time.
>>>
>>> The log from ipa-replica-install (with -d) is at
>>> http://home.cc.umanitoba.ca/~fonsecah/ipa/ipareplica-install.log
>>> The console script (mostly the same as the log but with my entries)
>>> is at
>>> http://home.cc.umanitoba.ca/~fonsecah/ipa/ipa-replica-install.console
>>> The 5 second pstacks are at
>>> http://home.cc.umanitoba.ca/~fonsecah/ipa/slapd-pstacks.console
>>>
>>> Thanks, Daryl
>>>
>>>
>>> On 03/11/16 02:40, thierry bordaz wrote:
>>>> Hello Deryl,
>>>>
>>>> My understanding is that ns-slapd is first slow to startup.
>>>> Then when krb5kdc is starting it may load ns-slapd.
>>>>
>>>> We identified krb5kdc may be impacted by the number of users
>>>> accounts.
>>>> From the ns-slapd errors log it is not clear why it is so slow
>>>> to start.
>>>>
>>>> Would you provide the ns-slapd access logs from that period.
>>>> Also in order to know where ns-slapd is spending time, it would
>>>> really help if you can get regular (each 5s) pstacks (with
>>>> 389-ds-debuginfo), during DS startup and then later during
>>>> krb5kdc startup.
>>>>
>>>> best regards
>>>> thierry
>>>>
>>>>
>>>> On 03/10/2016 11:10 PM, Daryl Fonseca-Holt wrote:
>>>>> Environment:
>>>>> RHEL 7.2
>>>>> IPA 4.2.0-15
>>>>> nss 3.19.1-19
>>>>> 389-ds-base 1.3.4.0-26
>>>>> sssd 1.13.0-40
>>>>>
>>>>>
>>>>> I've encountered this problem in IPA 3.0.0 but hoped it was
>>>>> addressed in 4.2.0.
>>>>>
>>>>> Trying to set up a replica of a master with 150,000+ user
>>>>> accounts, NIS and Schema Compatability enabled on the master.
>>>>>
>>>>> During ipa-replica-install it attempts to start IPA. dirsrv
>>>>> starts, krb5kdc starts, but then kadmind fails because krb5kdc has
>>>>> gone missing.
>>>>>
>>>>> This happens during restart of IPA in version 3.0.0 too. There it
>>>>> can be overcome by manually starting each component of IPA _but_
>>>>> waiting until ns-slapd-<instance> has settled down (as seen from
>>>>> top) before starting krb5kdc. I also think that the startup of
>>>>> krb5kdc loads the LDAP instance quite a bit.
>>>>>
>>>>> There is a problem in the startup logic where dirsrv is so busy
>>>>> that even though krb5kdc successfully starts and allows the kadmin
>>>>> to begin kdb5kdc is not really able to do its duties.
>>>>>
>>>>> I'm reporting this since there must be some way to delay the start
>>>>> of krb5kdc and then kadmind until ns-slapd-<instance> is really
>>>>> open for business.
>>>>>
>>>>> # systemctl status krb5kdc.service
>>>>> ● krb5kdc.service - Kerberos 5 KDC
>>>>> Loaded: loaded (/usr/lib/systemd/system/krb5kdc.service;
>>>>> disabled; vendor preset: disabled)
>>>>> Active: inactive (dead)
>>>>>
>>>>> Mar 10 14:19:13 jutta.cc.umanitoba.ca systemd[1]: Stopped Kerberos
>>>>> 5 KDC.
>>>>> Mar 10 14:20:36 jutta.cc.umanitoba.ca systemd[1]: Starting
>>>>> Kerberos 5 KDC...
>>>>> Mar 10 14:20:39 jutta.cc.umanitoba.ca systemd[1]: Started Kerberos
>>>>> 5 KDC.
>>>>>
>>>>> # systemctl status krb5kdc.service
>>>>> ● krb5kdc.service - Kerberos 5 KDC
>>>>> Loaded: loaded (/usr/lib/systemd/system/krb5kdc.service;
>>>>> disabled; vendor preset: disabled)
>>>>> Active: inactive (dead)
>>>>>
>>>>> Mar 10 14:19:13 jutta.cc.umanitoba.ca systemd[1]: Stopped Kerberos
>>>>> 5 KDC.
>>>>> Mar 10 14:20:36 jutta.cc.umanitoba.ca systemd[1]: Starting
>>>>> Kerberos 5 KDC...
>>>>> Mar 10 14:20:39 jutta.cc.umanitoba.ca systemd[1]: Started Kerberos
>>>>> 5 KDC.
>>>>>
>>>>> journalctl -xe was stale by the time I got to it so I've attached
>>>>> /var/log/messages instead.
>>>>>
>>>>> The log from ipa-replica-install (with -d) is at
>>>>> http://home.cc.umanitoba.ca/~fonsecah/ipa/ipareplica-install.log
>>>>> The console script (mostly the same as the log but with my
>>>>> entries) is at
>>>>> http://home.cc.umanitoba.ca/~fonsecah/ipa/ipa-replica-install.console
>>>>> The /var/log/dirsrv/ns-slapd-<instance> access log is at
>>>>> http://home.cc.umanitoba.ca/~fonsecah/ipa/access
>>>>>
>>>>> Regards, Daryl
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>> --
>>> --
>>> Daryl Fonseca-Holt
>>> IST/CNS/Unix Server Team
>>> University of Manitoba
>>> 204.480.1079
>>
>
> --
> --
> Daryl Fonseca-Holt
> IST/CNS/Unix Server Team
> University of Manitoba
> 204.480.1079
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/freeipa-users/attachments/20160316/fe69f097/attachment.htm>
More information about the Freeipa-users
mailing list