[Freeipa-devel] [PATCH] 1031 run cleanallruv task

Rob Crittenden rcritten at redhat.com
Thu Jul 5 18:39:53 UTC 2012


Martin Kosek wrote:
> On 07/03/2012 04:41 PM, Rob Crittenden wrote:
>> Deleting a replica can leave a replica update vector (RUV) on the other servers.
>> This can confuse things if the replica is re-added, and it also causes the
>> server to calculate changes against a server that may no longer exist.
>>
>> 389-ds-base provides a new task that propagates itself to all available
>> replicas to clean this RUV data.
>>
>> This patch will create this task at deletion time to hopefully clean things up.
>>
>> It isn't perfect. If any replica is down or unavailable at the time the
>> cleanruv task fires, and then comes back up, the old RUV data may be
>> re-propagated to the other servers.
>>
>> To make things easier in this case I've added two new commands to
>> ipa-replica-manage. The first lists the replication ids of all the servers we
>> have a RUV for. Using this you can call clean_ruv with the replication id of a
>> server that no longer exists to try the cleanallruv step again.
>>
>> This is quite dangerous though. If you run cleanruv against a replica id that
>> does exist it can cause a loss of data. I believe I've put in enough scary
>> warnings about this.
>>
>> rob
>>
>
> Good work there, this should make cleaning RUVs much easier than with the
> previous version.
>
> This is what I found during review:
>
> 1) The help for the list_ruv and clean_ruv commands is easy to miss in the man
> page. I think it would help if, for example, all the per-command information
> were indented. As it stands, a user could simply overlook the new commands.
>
>
> 2) I would rename new commands to clean-ruv and list-ruv to make them
> consistent with the rest of the commands (re-initialize, force-sync).
>
>
> 3) It would be nice to be able to run clean_ruv command in an unattended way
> (for better testing), i.e. respect --force option as we already do for
> ipa-replica-manage del. This fix would aid test automation in the future.
>
>
> 4) (minor) The new question (and the del one too) does not react well to CTRL+D:
>
> # ipa-replica-manage clean_ruv 3 --force
> Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389
>
> Cleaning the wrong replica ID will cause that server to no
> longer replicate so it may miss updates while the process
> is running. It would need to be re-initialized to maintain
> consistency. Be very careful.
> Continue to clean? [no]: unexpected error:
>
>
> 5) Running clean_ruv without its required parameter gives a confusing error,
> as it reports that the command is wrong, not that the parameter is missing:
>
> # ipa-replica-manage clean_ruv
> Usage: ipa-replica-manage [options]
>
> ipa-replica-manage: error: must provide a command [clean_ruv | force-sync |
> disconnect | connect | del | re-initialize | list | list_ruv]
>
> It seems you just forgot to specify the error message in the command definition.
>
>
> 6) When the remote replica is down, the clean_ruv command fails with an
> unexpected error:
>
> [root@vm-086 ~]# ipa-replica-manage clean_ruv 5
> Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389
>
> Cleaning the wrong replica ID will cause that server to no
> longer replicate so it may miss updates while the process
> is running. It would need to be re-initialized to maintain
> consistency. Be very careful.
> Continue to clean? [no]: y
> unexpected error: {'desc': 'Operations error'}
>
>
> /var/log/dirsrv/slapd-IDM-LAB-BOS-REDHAT-COM/errors:
> [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: failed to connect to repl agreement connection (cn=meTovm-055.idm.lab.bos.redhat.com,cn=replica,cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping tree,cn=config), error 105
> [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: replica (cn=meTovm-055.idm.lab.bos.redhat.com,cn=replica,cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping tree,cn=config) has not been cleaned.  You will need to rerun the CLEANALLRUV task on this replica.
> [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: Task failed (1)
>
> In this case I think we should inform the user that the command failed,
> possibly because of disconnected replicas, and that they could bring the
> replicas back up and try again.
>
>
> 7) (minor) "pass" is now redundant in replication.py:
> +        except ldap.INSUFFICIENT_ACCESS:
> +            # We can't make the server we're removing read-only but
> +            # this isn't a show-stopper
> +            root_logger.debug("No permission to switch replica to read-only, continuing anyway")
> +            pass
>
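
For readers who have not seen the task before: 389-ds-base starts CLEANALLRUV
from a task entry added under cn=cleanallruv,cn=tasks,cn=config. The sketch
below only builds the LDIF text for such an entry. It is an illustration, not
the code in the attached patch; the helper name cleanallruv_task_ldif is
hypothetical, and the base DN in the usage note is taken from the log above.

```python
def cleanallruv_task_ldif(replica_id, base_dn):
    """Return LDIF text for a CLEANALLRUV task entry.

    Hypothetical helper for illustration only; the patch creates the
    entry through its own LDAP code rather than through LDIF.
    """
    if not isinstance(replica_id, int) or replica_id < 1:
        raise ValueError("replica_id must be a positive integer")

    # The task is named after the replica ID it cleans.
    name = "clean %d" % replica_id
    lines = [
        "dn: cn=%s,cn=cleanallruv,cn=tasks,cn=config" % name,
        "objectclass: extensibleObject",
        "cn: %s" % name,
        # Suffix whose RUV should be cleaned.
        "replica-base-dn: %s" % base_dn,
        # ID of the replica that no longer exists.
        "replica-id: %d" % replica_id,
    ]
    return "\n".join(lines) + "\n"
```

Calling cleanallruv_task_ldif(5, "dc=idm,dc=lab,dc=bos,dc=redhat,dc=com")
yields an entry named "clean 5". As warned above, using the ID of a replica
that still exists can cause data loss.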

I think this addresses everything.
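
As a rough illustration of review items 3 and 4 (again, not the code in the
attached patch), a confirmation prompt can skip the question when --force is
given and treat CTRL+D as "no" instead of failing with an unexpected error.
The helper name confirm and its parameters are hypothetical, and the sketch
assumes Python 3's input().

```python
def confirm(prompt, force=False, default=False, read=input):
    """Ask a yes/no question and return True or False.

    Hypothetical helper for illustration; 'read' is injectable so the
    prompt can be exercised without a terminal.
    """
    if force:
        # --force: run unattended and never prompt.
        return True

    suffix = "[yes]" if default else "[no]"
    try:
        answer = read("%s %s: " % (prompt, suffix))
    except EOFError:
        # CTRL+D closes stdin. For a dangerous operation, declining
        # is the safe interpretation.
        return False

    answer = answer.strip().lower()
    if not answer:
        # A bare Enter accepts the default shown in brackets.
        return default
    return answer in ("y", "yes")
```

With this shape, "ipa-replica-manage clean_ruv 3 --force" would map to
confirm("Continue to clean?", force=True) and proceed without reading stdin.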

rob
-------------- next part --------------
A non-text attachment was scrubbed...
Name: freeipa-rcrit-1031-2-cleanruv.patch
Type: text/x-diff
Size: 11300 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/freeipa-devel/attachments/20120705/21f360fb/attachment.bin>

