[Freeipa-devel] [PATCH] 1050 prevent replica orphans

Martin Kosek mkosek at redhat.com
Mon Sep 17 14:06:37 UTC 2012


On 09/14/2012 09:16 PM, Rob Crittenden wrote:
> Martin Kosek wrote:
>> On 09/10/2012 08:34 PM, Rob Crittenden wrote:
>>> Martin Kosek wrote:
>>>> On Thu, 2012-09-06 at 17:22 -0400, Rob Crittenden wrote:
>>>>> Martin Kosek wrote:
>>>>>> On 08/31/2012 07:40 PM, Rob Crittenden wrote:
>>>>>>> Rob Crittenden wrote:
>>>>>>>> It was possible use ipa-replica-manage connect/disconnect/del to end up
>>>>>>>> orphaning or or more IPA masters. This is an attempt to catch and
>>>>>>>> prevent that case.
>>>>>>>>
>>>>>>>> I tested with this topology, trying to delete B.
>>>>>>>>
>>>>>>>> A <-> B <-> C
>>>>>>>>
>>>>>>>> I got here by creating B and C from A, connecting B to C then deleting
>>>>>>>> the link from A to B, so it went from A -> B and A -> C to the above.
>>>>>>>>
>>>>>>>> What I do is look up the servers that the delete candidate host has
>>>>>>>> connections to and see if we're the last link.
>>>>>>>>
>>>>>>>> I added an escape clause if there are only two masters.
>>>>>>>>
>>>>>>>> rob
>>>>>>>
>>>>>>> Oh, this relies on my cleanruv patch 1031.
>>>>>>>
>>>>>>> rob
>>>>>>>
>>>>>>
>>>>>> 1) When I run ipa-replica-manage del --force to an already uninstalled host,
>>>>>> the new code will prevent me the deletation because it cannot connect to
>>>>>> it. It
>>>>>> also crashes with UnboundLocalError:
>>>>>>
>>>>>> # ipa-replica-manage del vm-055.idm.lab.bos.redhat.com --force
>>>>>>
>>>>>> Unable to connect to replica vm-055.idm.lab.bos.redhat.com, forcing removal
>>>>>> Traceback (most recent call last):
>>>>>>      File "/sbin/ipa-replica-manage", line 708, in <module>
>>>>>>        main()
>>>>>>      File "/sbin/ipa-replica-manage", line 677, in main
>>>>>>        del_master(realm, args[1], options)
>>>>>>      File "/sbin/ipa-replica-manage", line 476, in del_master
>>>>>>        sys.exit("Failed read master data from '%s': %s" % (delrepl.hostname,
>>>>>> str(e)))
>>>>>> UnboundLocalError: local variable 'delrepl' referenced before assignment
>>>>>
>>>>> Fixed.
>>>>>
>>>>>>
>>>>>>
>>>>>> I also hit this error when removing a winsync replica.
>>>>>
>>>>> Fixed.
>>>>>
>>>>>>
>>>>>>
>>>>>> 2) As I wrote before, I think having --force option override the user
>>>>>> inquiries
>>>>>> would benefit test automation:
>>>>>>
>>>>>> +            if not ipautil.user_input("Continue to delete?", False):
>>>>>> +                sys.exit("Aborted")
>>>>>
>>>>> Fixed.
>>>>>
>>>>>>
>>>>>>
>>>>>> 3) I don't think this code won't cover this topology:
>>>>>>
>>>>>> A - B - C - D - E
>>>>>>
>>>>>> It would allow you deleting a replica C even though it would separate A-B
>>>>>> and
>>>>>> D-E. Though we may not want to cover this situation now, what you got is
>>>>>> definitely helping.
>>>>>
>>>>> I think you may be right. I only tested with 4 servers. With this B and
>>>>> D would both still have 2 agreements so wouldn't be covered by the last
>>>>> link test.
>>>>
>>>> Everything looks good now, so ACK. We just need to push it along with
>>>> CLEANALLRUV patch.
>>>>
>>>> Martin
>>>>
>>>
>>> Not to look a gift ACK In the mouth but here is a revised patch. I've added a
>>> cleanup routine to remove an orphaned master properly. If you had tried the
>>> mechanism I outlined in the man page it would have worked but was
>>> less-than-complete. This way is better, just don't try it on a live master.
>>>
>>> I also added a cleanruv abort command, in case you want to kill an existing
>>> task.
>>>
>>> rob
>>>
>>
>> 1) CLEANRUV stuff should be in your patch 1031 and not here (but I will comment
>> on the code in this mail since it is in this patch)
>>
>>
>> 2) In new command defitinion:
>>
>> +    "abort-clean-ruv":(1, 1, "Replica ID of to abort cleaning", ""),
>>
>> I miss error message in case REPLICA_ID is not passed, this way error message
>> when I omit the parameter is confusing:
>>
>> # ipa-replica-manage abort-clean-ruv
>> Usage: ipa-replica-manage [options]
>>
>> ipa-replica-manage: error: must provide a command [force-sync | clean-ruv |
>> disconnect | connect | list-ruv | del | re-initialize | list | abort-clean-ruv]
>>
>> On another note, I am thinking about the new command(s). Since we now have
>> abort-clean-ruv command, we may want to also implement list-clean-ruv commands
>> in the future to see all CLEANALLRUV commands in process... Or we may enhance
>> list-ruv to write a flag like [CLEANALLRUV in process] for RUV's for which
>> CLEANALLRUV task is in process.
>>
>>
>> 3) Will clean-ruv - abort-clean-ruv - clean-ruv sequence work? If the aborted
>> CLEANALLRUV task stays in DIT, we may not be able to enter new CLEANALLRUV task
>> because we always use the same DN...
>>
>> Btw. did abort CLEANALLRUV command worked for you? Mine seemed to be stuck on
>> replicas that are down just like CLEANALLRUV command. I had then paralelly
>> running CLEANALLRUV and ABORT CLEANALLRUV running for the same RUV ID. Then, it
>> is unclear to me what this command is actually good for.
>>
>>
>> 4) Man page now contains non-existent command:
>>
>> --- a/install/tools/man/ipa-replica-manage.1
>> +++ b/install/tools/man/ipa-replica-manage.1
>> @@ -42,12 +42,18 @@ Manages the replication agreements of an IPA server.
>>   \fBforce\-sync\fR
>>   \- Immediately flush any data to be replicated from a server specified with
>> the \-\-from option
>>   .TP
>> +\fBcleanup\fR
>> +\- Remove leftover references to a deleted master.
>> +.TP
>>
>>
>> The cleanup procedure was implemented rather as an option to the del command
>> than a separate command.
>>
>>
>> 5) My favorite - new cleanup procedure user prompt miss the --force option
>> useful for test automation
>>
>> +            if not ipautil.user_input("Continue to clean master?", False):
>> +                sys.exit("Cleanup aborted")
>>
>>
>> Otherwise the patch looks good.
>>
>> Martin
>>
> 
> I pulled the abort-ruv stuff out. It was just easier to stuff it in here than
> rebasing back, but yeah, its cleaner this way.
> 
> No need to check force for clean because it is already required (can't contact
> the deleted master, it's gone).
> 
> rob

The patch works fine now. So ACK when the dependent patch 1031 is pushed.

Martin




More information about the Freeipa-devel mailing list