[Freeipa-users] Attempting to re-provision previous replica

Rob Crittenden rcritten at redhat.com
Mon Nov 24 16:01:55 UTC 2014


John Desantis wrote:
> Hello again,
> 
> I was just wondering if there was an update on this thread?
> 
> Since it is just one machine having an issue, do you (Rob and Rich)
> think a re-initialization from the master on the affected host would
> clear the clog?  I have left it alone since Mark was brought into the
> discussion.

A re-init won't help because the RUVs are stored outside of the
replicated data.
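
For reference, each server keeps its own copy of the RUV, so you can see
what a given host currently holds with something along these lines
(dc=example,dc=com is just a stand-in for your suffix):

# ipa-replica-manage list-ruv
# ldapsearch -x -D "cn=directory manager" -W -b "dc=example,dc=com" \
    '(&(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff)(objectclass=nsTombstone))' \
    nsds50ruv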

rob

> 
> Thank you!
> John DeSantis
> 
> 2014-10-23 9:34 GMT-04:00 Rich Megginson <rmeggins at redhat.com>:
>> On 10/23/2014 07:01 AM, John Desantis wrote:
>>>
>>> Rob and Rich,
>>>
>>>>> ipa-replica-manage del should have cleaned things up. You can clear out
>>>>> old RUVs with ipa-replica-manage too via list-ruv and clean-ruv. You use
>>>>> list-ruv to get the id# to clean and clean-ruv to do the actual
>>>>> cleaning.
>>>>
>>>> I remember having previously tried this task, but it had failed on
>>>> older RUVs which were not even active (the KDC was under some strain
>>>> so ipa queries were timing out).  However, I ran it again and have
>>>> been able to delete all but 1 (it's still running) RUV referencing the
>>>> previous replica.
>>>>
>>>> I'll report back once the task finishes or fails.
>>>
>>> The last RUV is "stuck" on another replica.  It fails with the following
>>> error:
>>>
>>> [23/Oct/2014:08:55:09 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
>>> Initiating CleanAllRUV Task...
>>> [23/Oct/2014:08:55:10 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
>>> Retrieving maxcsn...
>>> [23/Oct/2014:08:55:10 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
>>> Found maxcsn (5447f861000000180000)
>>> [23/Oct/2014:08:55:10 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
>>> Cleaning rid (24)...
>>> [23/Oct/2014:08:55:10 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
>>> Waiting to process all the updates from the deleted replica...
>>> [23/Oct/2014:08:55:10 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
>>> Waiting for all the replicas to be online...
>>> [23/Oct/2014:08:55:10 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
>>> Waiting for all the replicas to receive all the deleted replica
>>> updates...
>>> [23/Oct/2014:08:55:11 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
>>> Replica maxcsn (5447f56b000200180000) is not caught up with deleted
>>> replica's maxcsn(5447f861000000180000)
>>> [23/Oct/2014:08:55:11 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
>>> Replica not caught up (agmt="cn=meToiparepbackup.our.personal.domain"
>>> (iparepbackup:389))
>>> [23/Oct/2014:08:55:11 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
>>> Not all replicas caught up, retrying in 10 seconds
>>> [23/Oct/2014:08:55:23 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
>>> Replica maxcsn (5447f56b000200180000) is not caught up with deleted
>>> replica's maxcsn(5447f861000000180000)
>>> [23/Oct/2014:08:55:23 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
>>> Replica not caught up (agmt="cn=meToiparepbackup.our.personal.domain"
>>> (iparepbackup:389))
>>> [23/Oct/2014:08:55:23 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
>>> Not all replicas caught up, retrying in 20 seconds
>>>
>>> I then aborted the task since the retry interval had climbed to 14400 seconds.
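>>>
>>> (For anyone following along: the abort can be done with something like
>>> the command below, where 24 is the rid from the log above -- assuming
>>> your ipa-replica-manage has the abort-clean-ruv subcommand:)
>>>
>>> # ipa-replica-manage abort-clean-ruv 24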
>>
>>
>> Mark, do you know what is going on here?
>>
>>
>>>
>>> Would this be a simple re-initialization from the master on the host
>>> "iparepbackup"?
>>>
>>> Thanks,
>>> John DeSantis
>>>
>>> 2014-10-22 16:03 GMT-04:00 John Desantis <desantis at mail.usf.edu>:
>>>>
>>>> Rob and Rich,
>>>>
>>>>> ipa-replica-manage del should have cleaned things up. You can clear out
>>>>> old RUVs with ipa-replica-manage too via list-ruv and clean-ruv. You use
>>>>> list-ruv to get the id# to clean and clean-ruv to do the actual
>>>>> cleaning.
>>>>
>>>> I remember having previously tried this task, but it had failed on
>>>> older RUVs which were not even active (the KDC was under some strain
>>>> so ipa queries were timing out).  However, I ran it again and have
>>>> been able to delete all but 1 (it's still running) RUV referencing the
>>>> previous replica.
>>>>
>>>> I'll report back once the task finishes or fails.
>>>>
>>>> Thanks,
>>>> John DeSantis
>>>>
>>>>
>>>> 2014-10-22 15:49 GMT-04:00 Rob Crittenden <rcritten at redhat.com>:
>>>>>
>>>>> Rich Megginson wrote:
>>>>>>
>>>>>> On 10/22/2014 12:55 PM, John Desantis wrote:
>>>>>>>
>>>>>>> Richard,
>>>>>>>
>>>>>>>> You should remove the unused ruv elements.  I'm not sure why they
>>>>>>>> were not
>>>>>>>> cleaned.  You may have to use cleanallruv manually.
>>>>>>>>
>>>>>>>> https://access.redhat.com/documentation/en-US/Red_Hat_Directory_Server/9.0/html/Administration_Guide/Managing_Replication-Solving_Common_Replication_Conflicts.html#cleanruv
>>>>>>>>
>>>>>>>>
>>>>>>>> note - use the cleanallruv procedure, not cleanruv.
>>>>>>>
>>>>>>> I'll try that, thanks for the guidance.
>>>>>>>
>>>>>>>> What is the real problem you have?  Did replication stop working? Are
>>>>>>>> you
>>>>>>>> getting error messages?
>>>>>>>
>>>>>>> I cannot get the host to be a replica.  Each time I run
>>>>>>> `ipa-replica-install
>>>>>>> replica-info-host-in-question.our.personal.domain.gpg' it fails.  I
>>>>>>> had assumed it was because the host had already been a
>>>>>>> replica but had to be taken offline due to a failing hard disk.  The
>>>>>>> machine was re-provisioned after the new hard drive was installed.
>>>>>>
>>>>>> Ok.  I don't know if we have a documented procedure for that case. I
>>>>>> assumed that if you first ran ipa-replica-manage del, then
>>>>>> ipa-replica-prepare, then ipa-replica-install, that would take care of
>>>>>> that.
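>>>>>>
>>>>>> Roughly, with the hostname below just a placeholder:
>>>>>>
>>>>>> # ipa-replica-manage del replica1.example.com       (on the master)
>>>>>> # ipa-replica-prepare replica1.example.com          (on the master)
>>>>>> # ipa-replica-install replica-info-replica1.example.com.gpg  (on the rebuilt host)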
>>>>>
>>>>> ipa-replica-manage del should have cleaned things up. You can clear out
>>>>> old RUVs with ipa-replica-manage too via list-ruv and clean-ruv. You use
>>>>> list-ruv to get the id# to clean and clean-ruv to do the actual
>>>>> cleaning.
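>>>>>
>>>>> Something like this, where the id is whatever list-ruv reports for the
>>>>> dead replica (19 below is just an example):
>>>>>
>>>>> # ipa-replica-manage list-ruv
>>>>> # ipa-replica-manage clean-ruv 19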
>>>>>
>>>>>>> When I enabled extra debugging during the installation process, the
>>>>>>> initial error was that the dirsrv instance couldn't be started.  I
>>>>>>> checked into this and found that there were missing files in
>>>>>>> /etc/dirsrv/slapd-BLAH directory.  I was then able to start dirsrv
>>>>>>> after copying some schema files from another replica.  The install did
>>>>>>> move forward but then failed with Apache and its IPA configuration.
>>>>>>>
>>>>>>> I performed several uninstalls and re-installs, and at one point I got
>>>>>>> error code 3 from ipa-replica-install, which is why I was thinking
>>>>>>> that the old RUVs and tombstones were to blame.
>>>>>>
>>>>>> It could be.  I'm really not sure what the problem is at this point.
>>>>>
>>>>> I think we'd need to see ipareplica-install.log to know for sure. It
>>>>> could be the sort of thing where it fails early but doesn't kill the
>>>>> install, so the last error is a red herring.
>>>>>
>>>>> rob
>>>>>
>>>>>>> Thanks,
>>>>>>> John DeSantis
>>>>>>>
>>>>>>>
>>>>>>> 2014-10-22 12:51 GMT-04:00 Rich Megginson <rmeggins at redhat.com>:
>>>>>>>>
>>>>>>>> On 10/22/2014 10:31 AM, John Desantis wrote:
>>>>>>>>>
>>>>>>>>> Richard,
>>>>>>>>>
>>>>>>>>> You helped me before in #freeipa, so I appreciate the assistance
>>>>>>>>> again.
>>>>>>>>>
>>>>>>>>>> What version of 389 are you using?
>>>>>>>>>> rpm -q 389-ds-base
>>>>>>>>>
>>>>>>>>> 389-ds-base-1.2.11.15-34.el6_5
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> John DeSantis
>>>>>>>>>
>>>>>>>>> 2014-10-22 12:09 GMT-04:00 Rich Megginson <rmeggins at redhat.com>:
>>>>>>>>>>
>>>>>>>>>> On 10/22/2014 09:42 AM, John Desantis wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hello all,
>>>>>>>>>>>
>>>>>>>>>>> First and foremost, a big "thank you!" to the FreeIPA developers
>>>>>>>>>>> for a
>>>>>>>>>>> great product!
>>>>>>>>>>>
>>>>>>>>>>> Now, to the point!
>>>>>>>>>>>
>>>>>>>>>>> We're trying to re-provision a previous replica using the standard
>>>>>>>>>>> documentation via the Red Hat site:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Identity_Management_Guide/Setting_up_IPA_Replicas.html
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> However, we're running into errors during the import process.  The
>>>>>>>>>>> errors vary and the install fails at seemingly random steps: an issue
>>>>>>>>>>> with NTP, or HTTP, or LDAP, etc.  This did not happen when we promoted
>>>>>>>>>>> a separate node to become a replica.
>>>>>>>>>>>
>>>>>>>>>>> We had previously removed the replica via `ipa-replica-manage del`
>>>>>>>>>>> and
>>>>>>>>>>> ensured that no trace of it being a replica existed: removed DNS
>>>>>>>>>>> records and verified that the host enrollment was not present.  I
>>>>>>>>>>> did
>>>>>>>>>>> not use the "--force" and "--cleanup" options.
>>>>>>>>>>
>>>>>>>>>> What version of 389 are you using?
>>>>>>>>>> rpm -q 389-ds-base
>>>>>>>>
>>>>>>>> You should remove the unused ruv elements.  I'm not sure why they
>>>>>>>> were not
>>>>>>>> cleaned.  You may have to use cleanallruv manually.
>>>>>>>>
>>>>>>>> https://access.redhat.com/documentation/en-US/Red_Hat_Directory_Server/9.0/html/Administration_Guide/Managing_Replication-Solving_Common_Replication_Conflicts.html#cleanruv
>>>>>>>>
>>>>>>>>
>>>>>>>> note - use the cleanallruv procedure, not cleanruv.
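>>>>>>>>
>>>>>>>> The manual route boils down to adding a cleanallruv task entry,
>>>>>>>> roughly like this (the rid and the suffix below are just examples):
>>>>>>>>
>>>>>>>> # ldapmodify -x -D "cn=directory manager" -W <<EOF
>>>>>>>> dn: cn=clean 19, cn=cleanallruv, cn=tasks, cn=config
>>>>>>>> changetype: add
>>>>>>>> objectclass: extensibleObject
>>>>>>>> replica-base-dn: dc=our,dc=personal,dc=domain
>>>>>>>> replica-id: 19
>>>>>>>> cn: clean 19
>>>>>>>> EOF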
>>>>>>>>
>>>>>>>>>>> When I check RUV's against the host in question, there are
>>>>>>>>>>> several.  I
>>>>>>>>>>> also queried the tombstones against the host and found two entries
>>>>>>>>>>> which have valid hex time stamps;  coincidentally, out of the 9
>>>>>>>>>>> tombstone entries, 2 have "nsds50ruv" time stamps.  I'll paste
>>>>>>>>>>> sanitized output below:
>>>>>>>>>>>
>>>>>>>>>>> # ldapsearch -x -W -LLL -D "cn=directory manager" -b
>>>>>>>>>>> "dc=our,dc=personal,dc=domain" '(objectclass=nsTombstone)'
>>>>>>>>>>> Enter LDAP Password:
>>>>>>>>>>> dn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,dc=our,dc=personal,dc=domain
>>>>>>>>>>> objectClass: top
>>>>>>>>>>> objectClass: nsTombstone
>>>>>>>>>>> objectClass: extensibleobject
>>>>>>>>>>> nsds50ruv: {replicageneration} 50ef13ae000000040000
>>>>>>>>>>> nsds50ruv: {replica 4 ldap://master.our.personal.domain:389} 5164d147000000040000 5447bda8000100040000
>>>>>>>>>>> nsds50ruv: {replica 22 ldap://separatenode.our.personal.domain:389} 54107f9f000000160000 54436b25000000160000
>>>>>>>>>>> nsds50ruv: {replica 21 ldap://iparepbackup.our.personal.domain:389} 51b734de000000150000 51b734ef000200150000
>>>>>>>>>>> nsds50ruv: {replica 19 ldap://host-in-question.our.personal.domain:389} 510d56c9000100130000 510d82be000200130000
>>>>>>>>>>> nsds50ruv: {replica 18 ldap://host-in-question.our.personal.domain:389}
>>>>>>>>>>> nsds50ruv: {replica 17 ldap://host-in-question.our.personal.domain:389}
>>>>>>>>>>> nsds50ruv: {replica 16 ldap://host-in-question.our.personal.domain:389}
>>>>>>>>>>> nsds50ruv: {replica 15 ldap://host-in-question.our.personal.domain:389}
>>>>>>>>>>> nsds50ruv: {replica 14 ldap://host-in-question.our.personal.domain:389}
>>>>>>>>>>> nsds50ruv: {replica 13 ldap://host-in-question.our.personal.domain:389}
>>>>>>>>>>> nsds50ruv: {replica 12 ldap://host-in-question.our.personal.domain:389}
>>>>>>>>>>> nsds50ruv: {replica 23 ldap://host-in-question.our.personal.domain:389} 54187702000200170000 5418789a000000170000
>>>>>>>>>>> dc: our
>>>>>>>>>>> nsruvReplicaLastModified: {replica 4 ldap://master.our.personal.domain:389} 5447bce8
>>>>>>>>>>> nsruvReplicaLastModified: {replica 22 ldap://separatenode.our.personal.domain:389} 54436a5e
>>>>>>>>>>> nsruvReplicaLastModified: {replica 21 ldap://iparepbackup.our.personal.domain:389} 00000000
>>>>>>>>>>> nsruvReplicaLastModified: {replica 19 ldap://host-in-question.our.personal.domain:389} 00000000
>>>>>>>>>>> nsruvReplicaLastModified: {replica 18 ldap://host-in-question.our.personal.domain:389} 00000000
>>>>>>>>>>> nsruvReplicaLastModified: {replica 17 ldap://host-in-question.our.personal.domain:389} 00000000
>>>>>>>>>>> nsruvReplicaLastModified: {replica 16 ldap://host-in-question.our.personal.domain:389} 00000000
>>>>>>>>>>> nsruvReplicaLastModified: {replica 15 ldap://host-in-question.our.personal.domain:389} 00000000
>>>>>>>>>>> nsruvReplicaLastModified: {replica 14 ldap://host-in-question.our.personal.domain:389} 00000000
>>>>>>>>>>> nsruvReplicaLastModified: {replica 13 ldap://host-in-question.our.personal.domain:389} 00000000
>>>>>>>>>>> nsruvReplicaLastModified: {replica 12 ldap://host-in-question.our.personal.domain:389} 00000000
>>>>>>>>>>> nsruvReplicaLastModified: {replica 23 ldap://host-in-question.our.personal.domain:389} 00000000
>>>>>>>>>>>
>>>>>>>>>>> dn: nsuniqueid=c08a2803-5b5a11e2-a527ce8b-8fa47d35,cn=host-in-question.our.personal.domain,cn=masters,cn=ipa,cn=etc,dc=our,dc=personal,dc=domain
>>>>>>>>>>> objectClass: top
>>>>>>>>>>> objectClass: nsContainer
>>>>>>>>>>> objectClass: nsTombstone
>>>>>>>>>>> cn: host-in-question.our.personal.domain
>>>>>>>>>>> nsParentUniqueId: e6fa9418-5b5711e2-a1a9825b-daf5b5b0
>>>>>>>>>>>
>>>>>>>>>>> dn: nsuniqueid=664c4383-6d6311e2-8db6e946-de27dd8d,cn=host-in-question.our.personal.domain,cn=masters,cn=ipa,cn=etc,dc=our,dc=personal,dc=domain
>>>>>>>>>>> objectClass: top
>>>>>>>>>>> objectClass: nsContainer
>>>>>>>>>>> objectClass: nsTombstone
>>>>>>>>>>> cn: host-in-question.our.personal.domain
>>>>>>>>>>> nsParentUniqueId: e6fa9418-5b5711e2-a1a9825b-daf5b5b0
>>>>>>>>>>>
>>>>>>>>>>> As you can see, "host-in-question" has many RUVs, two of which
>>>>>>>>>>> appear to be "active", and there are two tombstone entries which I
>>>>>>>>>>> believe (pardon my ignorance) may correlate with those "active"
>>>>>>>>>>> RUV entries for "host-in-question".
>>>>>>>>>>>
>>>>>>>>>>> Do these two tombstone entries need to be deleted with ldapdelete
>>>>>>>>>>> before we can re-provision "host-in-question" and add it back as a
>>>>>>>>>>> replica?
>>>>>>>>
>>>>>>>> No, you cannot delete tombstones manually.  They will be cleaned up
>>>>>>>> at some
>>>>>>>> point by the dirsrv tombstone reap thread, and they should not be
>>>>>>>> interfering with anything.
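>>>>>>>>
>>>>>>>> If you're curious how long tombstones hang around before the reap
>>>>>>>> thread removes them, the relevant attributes live on the replica
>>>>>>>> configuration entry, e.g.:
>>>>>>>>
>>>>>>>> # ldapsearch -x -D "cn=directory manager" -W -b "cn=config" \
>>>>>>>>     '(objectclass=nsDS5Replica)' nsDS5ReplicaPurgeDelay \
>>>>>>>>     nsDS5ReplicaTombstonePurgeInterval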
>>>>>>>>
>>>>>>>> What is the real problem you have?  Did replication stop working? Are
>>>>>>>> you
>>>>>>>> getting error messages?
>>>>>>>>
>>>>>>>>>>> Thank you,
>>>>>>>>>>> John DeSantis
>>>>>>>>>>>
>>>>>>>>
>>
>>



