[Freeipa-users] ipa replica failure

Andrew E. Bruno aebruno2 at buffalo.edu
Fri Jun 19 19:57:38 UTC 2015


On Fri, Jun 19, 2015 at 03:18:50PM -0400, Rob Crittenden wrote:
> Rich Megginson wrote:
> >On 06/19/2015 12:22 PM, Andrew E. Bruno wrote:
> >>
> >>Questions:
> >>
> >>0. Is it likely that after running out of file descriptors the dirsrv
> >>slapd database on rep2 was corrupted?
> >
> >That would appear to be the case based on correlation of events,
> >although I've never seen that happen, and it is not supposed to happen.
> >
> >>
> >>1. Do we have to run ipa-replica-manage del rep2 on *each* of the
> >>remaining replica servers (rep1 and rep3)? Or should it just be run on
> >>the first master?
> >
> >I believe it should only be run on the first master, but it hung, so
> >something is not right, and I'm not sure how to remedy the situation.
> 
> How long did it hang, and where?

This command was run on rep1 (first master):

[rep1]$ ipa-replica-manage del rep2 

This command hung for about 10 minutes until I pressed Ctrl-C. After noticing
that LDAP queries were hanging on rep2, we ran this on rep2:

[rep2]$ systemctl stop ipa
(this shut down all IPA services on rep2)

Then, back on rep1 (first master), we ran:

[rep1]$ ipa-replica-manage -v --force del rep2

This appeared to work OK.

> 
> >>Do we need to run ipa-csreplica-manage del as well?
> >>
> >>2. Why does the rep2 server still appear when querying the
> >>nsDS5ReplicationAgreement in ldap? Is this benign or will this pose
> >>problems when we go to add rep2 back in?
> >
> >You should remove it.
> 
> And ipa-csreplica-manage is the tool to do it.

When I run this on rep1 (first master):

[rep1]$ ipa-csreplica-manage list
Directory Manager password: 

rep3: master
rep1: master


[rep1]$ ipa-csreplica-manage del rep2
Directory Manager password: 

'rep1' has no replication agreement for 'rep2'

But rep2 still seems to be there (note the replica 97 entry under nsds50ruv):

[rep1]$ ldapsearch -Y GSSAPI -b "cn=mapping tree,cn=config" objectClass=nsDS5ReplicationAgreement -LL

dn: cn=masterAgreement1-rep3-pki-tomcat,cn=replica,cn=ipaca,cn=mapping tree,cn=config
objectClass: top
objectClass: nsds5replicationagreement
cn: masterAgreement1-rep3-pki-tomcat
nsDS5ReplicaRoot: o=ipaca
nsDS5ReplicaHost: rep3
nsDS5ReplicaPort: 389
nsDS5ReplicaBindDN: cn=Replication Manager cloneAgreement1-rep3-pki-tomcat,ou=csusers,cn=config
nsDS5ReplicaBindMethod: Simple
nsDS5ReplicaTransportInfo: TLS
description: masterAgreement1-rep3-pki-tomcat
nsds50ruv: {replicageneration} 5527f74b000000600000
nsds50ruv: {replica 91 ldap://rep3:389} 5537c7ba0000005b
 0000 5582c7e40004005b0000
nsds50ruv: {replica 96 ldap://rep1:389} 5527f75400000060
 0000 5582cd19000000600000
nsds50ruv: {replica 97 ldap://rep2:389} 5527f76000000061
 0000 556f462b000400610000
nsruvReplicaLastModified: {replica 91 ldap://rep3:389} 0
 0000000
nsruvReplicaLastModified: {replica 96 ldap://rep1:389} 0
 0000000
nsruvReplicaLastModified: {replica 97 ldap://rep2:389} 0
 0000000
nsds5replicaLastUpdateStart: 20150619193149Z
nsds5replicaLastUpdateEnd: 20150619193149Z
nsds5replicaChangesSentSinceStartup:: OTY6MTMyLzAg
nsds5replicaLastUpdateStatus: 0 Replica acquired successfully: Incremental upd
 ate succeeded
nsds5replicaUpdateInProgress: FALSE
nsds5replicaLastInitStart: 0
nsds5replicaLastInitEnd: 0
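
As a rough aid for reading output like the above, here is a minimal Python sketch that unfolds the LDIF continuation lines and lists which hosts still appear in the nsds50ruv values. The function names and the `expected_hosts` argument are hypothetical, introduced only for illustration; the sample data is copied from the search result above. It only reports what it finds and changes nothing on the server.

```python
import re

# Matches RUV elements such as "{replica 97 ldap://rep2:389} <min csn> <max csn>".
RUV_RE = re.compile(r"^\{replica (\d+) ldap://([^:}]+):(\d+)\}")


def unfold_ldif(text):
    """Join LDIF continuation lines (those starting with a space) to the previous line."""
    lines = []
    for raw in text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def ruv_replicas(ldif_text):
    """Return {replica_id: hostname} for every nsds50ruv replica element."""
    found = {}
    for line in unfold_ldif(ldif_text):
        attr, sep, value = line.partition(": ")
        if not sep or attr.lower() != "nsds50ruv":
            continue
        match = RUV_RE.match(value)
        if match:  # the {replicageneration} element does not match and is skipped
            found[int(match.group(1))] = match.group(2)
    return found


def stale_replicas(ldif_text, expected_hosts):
    """Return the RUV entries whose host is not in expected_hosts (hypothetical helper)."""
    return {rid: host for rid, host in ruv_replicas(ldif_text).items()
            if host not in expected_hosts}


# RUV lines copied from the ldapsearch output on rep1.
SAMPLE = """\
nsds50ruv: {replicageneration} 5527f74b000000600000
nsds50ruv: {replica 91 ldap://rep3:389} 5537c7ba0000005b
 0000 5582c7e40004005b0000
nsds50ruv: {replica 96 ldap://rep1:389} 5527f75400000060
 0000 5582cd19000000600000
nsds50ruv: {replica 97 ldap://rep2:389} 5527f76000000061
 0000 556f462b000400610000
"""

if __name__ == "__main__":
    print(stale_replicas(SAMPLE, {"rep1", "rep3"}))
```

With rep1 and rep3 as the expected masters, the only entry reported for this sample is replica 97 on rep2.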


However, when I run the same ldapsearch on rep3, it's not there (no agreement
under cn=ipaca,cn=mapping tree,cn=config is listed):

[rep3]$ ldapsearch -Y GSSAPI -b "cn=mapping tree,cn=config" objectClass=nsDS5ReplicationAgreement -LL

dn: cn=meTorep1,cn=replica,cn=dc\3Dccr\2Cdc\3Dbuffalo\2C dc\3Dedu,cn=mapping tree,cn=config
cn: meTorep1
objectClass: nsds5replicationagreement
objectClass: top
nsDS5ReplicaTransportInfo: LDAP
description: me to rep1
nsDS5ReplicaRoot: dc=ccr,dc=buffalo,dc=edu
nsDS5ReplicaHost: rep1


> 
> >>
> >>3. What steps/commands can we take to verify rep2 was successfully
> >>removed and replication is behaving normally?
> 
> The ldapsearch you performed already will confirm that the CA agreement has
> been removed.

It is still showing up. Any thoughts?

At this point we want to ensure both remaining masters are functional and
operating normally. Are there any other commands you recommend running to check?

> >
> >8192 is extremely high.  The fact that you ran out of file descriptors
> >at 8192 seems like a bug/fd leak somewhere.  I suppose you could, as a
> >very temporary workaround, set the fd limit higher, but that is no
> >guarantee that you won't run out again.
> >
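
To watch for a descriptor leak like the one described above, one option is to compare a process's open descriptor count with its soft limit. The following is a minimal sketch, assuming a Linux /proc filesystem; the function names are hypothetical, and reading another user's process normally requires root.

```python
import os
import re


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    A value of 'unlimited' is returned as None.
    """
    match = re.search(r"^Max open files\s+(\S+)\s+(\S+)", limits_text, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Max open files' row found")
    soft, hard = (None if v == "unlimited" else int(v) for v in match.groups())
    return soft, hard


def fd_usage(pid):
    """Return (open_fds, soft_limit) for a process (Linux only)."""
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    with open(f"/proc/{pid}/limits") as handle:
        soft, _hard = parse_max_open_files(handle.read())
    return open_fds, soft


if __name__ == "__main__":
    # Demonstrated on this script's own PID; substitute the directory server's
    # PID (for example from `pidof ns-slapd`) to watch that process instead.
    used, limit = fd_usage(os.getpid())
    print(f"{used} descriptors open, soft limit {limit}")
```

Sampling this periodically and logging the count would show whether usage climbs steadily toward the limit, which is the kind of detail that would be useful in a ticket.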
> >Please file at least 1 ticket e.g. "database corrupted when server ran
> >out of file descriptors", with as much information about that particular
> >problem as you can provide.
> >

Will do.

Thanks very much for all the help!

--Andrew



