[Freeipa-users] stubborn old replicas

thierry bordaz tbordaz at redhat.com
Thu Aug 27 08:05:34 UTC 2015


On 08/27/2015 09:41 AM, Ludwig Krispenz wrote:
>
> On 08/27/2015 09:08 AM, Martin Kosek wrote:
>> On 08/26/2015 05:31 PM, Simo Sorce wrote:
>>> On Wed, 2015-08-26 at 06:36 -0700, Janelle wrote:
>>>> Hello all,
>>>>
>>>> My biggest problem is losing replicas and then trying to delete the
>>>> entries and rebuild them. Here is a perfect example: I simply can't get
>>>> rid of these (see below). I have tried (of course after the ORIGINAL
>>>> "ipa-replica-manage del hostname --force --clean"):
>>>>
>>>> ipa-replica-manage clean-ruv 25
>>>>
>>>> ldapmodify... with:
>>>>     dn: cn=clean 25, cn=cleanallruv, cn=tasks, cn=config
>>>>     objectclass: extensibleObject
>>>>     replica-base-dn: dc=example,dc=com
>>>>     replica-id: 25
>>>>     cn: clean 25
>>>>
>>>> And yet nothing works. Any suggestions? This is perhaps the most
>>>> frustrating part about maintaining IPA.
>>>>
>>>> ~J
>>>>
>>>> unable to decode: {replica 12} 5588dc2e0000000c0000 559f3de60004000c0000
>>>> unable to decode: {replica 14} 5587aa8d0000000e0000 5587aa8d0003000e0000
>>>> unable to decode: {replica 16} 5588f58f000000100000 55bb7b08000500100000
>>>> unable to decode: {replica 25} 55a4887b000000190000 55a49242000400190000
>>>> unable to decode: {replica 29} 55d199a50001001d0000 55d199a50001001d0000
>>>> unable to decode: {replica 3} 5587c5c3000000030000 55b8a049000100030000
>>>> unable to decode: {replica 5} 55cc82ab041d00050000 55cc82ab041d00050000
>>> Have you tried restarting DS before trying to clean the RUV?
>>>
>>> I ran into a similar problem in a test install recently, and I got
>>> better results that way. The bug is known to the DS people and they are
>>> working to get out patches that fix the root issue.
>>>
>>> Simo.
>> CCing DS folks. Wasn't there a recent DS fix that was supposed to
>> improve the RUV situation?
>>
>> Looking at 389 DS Trac, I see some interesting RUV fixes in 1.3.4.x
>> releases:
>>
>> https://fedorahosted.org/389/query?summary=~RUV&status=closed&order=milestone&col=id&col=summary&col=status&col=owner&col=type&col=priority&col=milestone 
>>
>>
>> I see that 389-ds-base-1.3.4.3 is already in Fedora 22+, does the RUV
>> issue happen there?
> It should not, and I think Thierry verified the fix.
> The problem we resolved, which we think is the core of the corrupted
> RUV, was that the cleanallruv task only purged the RUV but did not
> purge the changelog. If cleanallruv was run and the server then had a
> disorderly shutdown (a crash, or an abort when shutdown was hanging),
> the changelog RUV was rebuilt at restart from the data in the
> changelog. If the changelog contained a CSN from a cleaned RID, that
> RID was added back to the RUV (but the reference to the server was
> lost, so the URL part is missing from this RUV element).
> The fix now removes all references to the cleaned RID from the
> changelog, so the problem should not recur for RIDs cleaned with the
> fix. Of course, the changelog can still contain references to RIDs
> cleaned before the fix, and if no changelog trimming is configured
> this is what will happen. So, even after the fix, old RUVs could pop
> up and have to be (finally) cleaned.
>
> The other source is that these corrupted RUVs can be "imported" from
> another server when RUVs are exchanged in the replication protocol.
> Cleanallruv tries to address this by propagating the cleanallruv task
> to all servers it thinks are connected. If there are replication
> agreements to servers which no longer exist or which cannot be
> reached, this will delay the RUV cleaning.
>

Hello,

I verified the fix in 1.3.4.2 (F22) / 389-ds-base-1.3.4.0-6.el7 (RHEL7), so
starting with those versions CLEANALLRUV no longer creates corrupted ruv
elements.
According to the timestamps in the ruv (for example csn2date.py
5587aa8d0003000e0000 --> 22/06/2015:06:26:21), these are old ruv elements.
I think Ludwig is right: these corrupted ruv-elements come from old
cleanallruv runs, before the fix was applied.
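
For anyone who wants to check the other CSNs in the list the same way, here is a minimal sketch of the decoding. It is not the csn2date.py script itself; the function name decode_csn is made up for this example, and it assumes the usual 20 hex digit CSN layout (8 digits of timestamp in seconds since the epoch, 4 of sequence number, 4 of replica id, 4 of sub-sequence number), with the time shown in UTC.

```python
from datetime import datetime, timezone


def decode_csn(csn):
    """Split a 20-hex-digit CSN into its fields (assumed layout, see above)."""
    if len(csn) != 20:
        raise ValueError("expected 20 hex digits, got %d" % len(csn))
    return {
        # first 8 hex digits: seconds since the epoch
        "time": datetime.fromtimestamp(int(csn[0:8], 16), tz=timezone.utc),
        # next 4: sequence number within that second
        "seq": int(csn[8:12], 16),
        # next 4: id of the replica that generated the change
        "replica_id": int(csn[12:16], 16),
        # last 4: sub-sequence number
        "subseq": int(csn[16:20], 16),
    }


if __name__ == "__main__":
    parts = decode_csn("5587aa8d0003000e0000")
    # Same dd/mm/yyyy:HH:MM:SS layout as the example quoted in this mail
    print(parts["time"].strftime("%d/%m/%Y:%H:%M:%S"), "replica", parts["replica_id"])
```

For the max CSN of replica 14 this should give the same date as above, and the replica id field (0x000e) matches the "{replica 14}" label.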

The problem is that even a fixed server can get those corrupted
ruv-elements from other servers.
All servers in the topology should be updated with that fix, so that at
least they stop creating corrupted ruv-elements.
Now, to get rid of the existing ones, I can only think of the brute-force
option of recreating the replica and reinitializing it... I hope another
option is possible.
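
Before going that far it may help to see which servers still carry such elements. Below is a rough sketch that picks out ruv elements with no URL part, which is how Ludwig describes the corrupted ones. The function name and the sample host name are invented for the example; it assumes elements look like "{replica <id> <url>} <mincsn> <maxcsn>" and that the values were collected separately (for instance from the list-ruv output or from the nsds50ruv attribute).

```python
import re

# "{replica <id>}" or "{replica <id> <url>}", optionally followed by min/max CSN
RUV_ELEMENT = re.compile(
    r"^\{replica (\d+)(?: (\S+))?\}"
    r"(?: ([0-9a-fA-F]{20}) ([0-9a-fA-F]{20}))?$"
)

PREFIX = "unable to decode:"


def urlless_replica_ids(values):
    """Return the replica ids of ruv elements that carry no URL part."""
    ids = []
    for value in values:
        value = value.strip()
        # Accept lines copied from the error output quoted in this thread
        if value.startswith(PREFIX):
            value = value[len(PREFIX):].strip()
        match = RUV_ELEMENT.match(value)
        # Lines that are not replica elements (e.g. {replicageneration}) are skipped
        if match and match.group(2) is None:
            ids.append(int(match.group(1)))
    return ids


if __name__ == "__main__":
    sample = [
        # hypothetical healthy element, host name made up
        "{replica 4 ldap://ipa1.example.com:389} 5587aa8d000000040000 55d199a5000100040000",
        # two of the elements reported at the top of this thread
        "unable to decode: {replica 12} 5588dc2e0000000c0000 559f3de60004000c0000",
        "unable to decode: {replica 25} 55a4887b000000190000 55a49242000400190000",
    ]
    print(urlless_replica_ids(sample))
```

Running this against the values from each server would at least show where the old elements still live; it does not clean anything.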

thanks
thierry

