[Freeipa-devel] [PATCH] WIP backup and restore

Rob Crittenden rcritten at redhat.com
Tue Apr 9 21:21:10 UTC 2013


Petr Viktorin wrote:
> On 04/05/2013 10:54 PM, Rob Crittenden wrote:
>> Petr Viktorin wrote:
>>> On 04/04/2013 03:04 PM, Rob Crittenden wrote:
>>>> Rob Crittenden wrote:
>>>>> Petr Viktorin wrote:
>>>>>> On 03/23/2013 05:06 AM, Rob Crittenden wrote:
>>>>>>> There are strict limits on what can be restored where. Only exact
>>>>>>> matching hostnames and versions are allowed right now. We can
>>>>>>> probably
>>>>>>> relax the hostname requirement if we're only restoring data, and the
>>>>>>> version perhaps only for the first two values (so you can restore a
>>>>>>> 3.0.0 backup on 3.0.1 but not on 3.1.0).
>>>>>>
>>>>>> Do we also need to limit the versions of Dogtag, 389, Kerberos...?
>>>>>> Or is what they put in /var/lib guaranteed portable across versions?
>>>>>
>>>>> An interesting question. I'd imagine that a major db change would also
>>>>> require a major update to IPA, therefore if we restrict by IPA version
>>>>> we should be ok. I AM doing an untar of files though so yeah, there
>>>>> is a
>>>>> risk here.
>>>>>
>>>>>>
>>>>>>
>>>>>> That's good to hear!
>>>>>>
>>>>>> I think while developing, you should run with -v to get all the
>>>>>> output.
>>>>>> And for production, please have the commands log by default (set
>>>>>> log_file_name).
>>>>>
>>>>> Yes, I plan on adding that in the future.
>>>>>
>>>>>>
>>>>>>> ipa-backup does all its binds using ldapi. ipa-restore needs the DM
>>>>>>> password because we need to contact remote servers to enable/disable
>>>>>>> replication.
>>>>>>
>>>>>> The restore assumes that IPA is already installed, right? I can't
>>>>>> just run it on a clean machine? It would be good to check/document
>>>>>> this.
>>>>>
>>>>> It depends.
>>>>>
>>>>> For a full backup you can actually restore to an uninstalled server.
>>>>> In fact, you could in all likelihood restore to a machine with no IPA
>>>>> bits on it at all (which wouldn't be very nice, but not exactly wrong
>>>>> either IMHO).
>>>>>
>>>>> I tested this with:
>>>>>   - ipa-server-install
>>>>>   - ipa-backup
>>>>>   - ipa-server-install --uninstall -U
>>>>>   - ipa-restore
>>>>>   - ran the unit tests
>>>>>
>>>>>> I looked in the backup tarball and saw dirsrv PID file and lock
>>>>>> directories. Are these needed?
>>>>>
>>>>> Probably not. I gathered some of the files to back up by watching
>>>>> which files, independent of us, changed during an install.
>>>>> I'll take another pass through it; there may be other oddities too.
>>>>>
>>>>>> The tool runners (install/tools/ipa-backup and
>>>>>> install/tools/ipa-restore) are missing, so IPA doesn't build.
>>>>>> Probably
>>>>>> just a forgotten git add.
>>>>>
>>>>> Yup.
>>>>>
>>>>>>
>>>>>> The patch adds an extra backslash in install/tools/man/Makefile.am;
>>>>>> consider adding $(NULL) at the end.
>>>>>
>>>>> I'll take a look.
>>>>>
>>>>>>
>>>>>> Backup.dirs, Backup.files, Backup.logs:
>>>>>> The code modifies these class-level attributes. This means you can't
>>>>>> run
>>>>>> the command more than once in one Python session, so it can't be
>>>>>> used as
>>>>>> a library (e.g. in a hypothetical test suite).
>>>>>> Please make the class attributes immutable (tuples), then in
>>>>>> __init__ do `self.dirs = list(self.dirs)` to get instance-specific
>>>>>> lists.
>>>>>
>>>>> Ok, fair point.
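For reference, the suggested pattern might look like this (a minimal sketch; the directory paths here are illustrative, not the real Backup defaults):

```python
class Backup:
    # Immutable class-level defaults, safe to share across instances.
    dirs = ('/etc/dirsrv', '/etc/httpd')
    files = ('/etc/hosts',)

    def __init__(self):
        # Instance-specific mutable copies: run() can append to these
        # without changing the class attributes for later instances.
        self.dirs = list(self.dirs)
        self.files = list(self.files)
```

A second instance created after the first one has mutated its lists still starts from the pristine tuples.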
>>>>>
>>>>>> Code that creates temporary files/directories or does chdir()
>>>>>> should be
>>>>>> next to (or in) a try block whose finally: restores things.
>>>>>
>>>>> Yes, I mean to add a big try/finally block around everything in run
>>>>> eventually (or the equivalent anyway).
>>>>>
>>>>>>
>>>>>> Instead of star imports (from ipapython.ipa_log_manager import *),
>>>>>> please explicitly import whatever you're using from that package. In
>>>>>> this case, nothing at all!
>>>>>
>>>>> Yeah, I haven't run this through pylint yet to see all the bad
>>>>> imports I
>>>>> cobbled together.
>>>>>
>>>>>> If there's a get_connection() method, it should return the
>>>>>> connection,
>>>>>> and it should always be used to get it. Store the connection in
>>>>>> self._conn, and rather than:
>>>>>>      self.get_connection()
>>>>>>      self.conn.do_work()
>>>>>> do:
>>>>>>      conn = self.get_connection()
>>>>>>      conn.do_work()
>>>>>> This makes forgetting to call get_connection() impossible.
>>>>>
>>>>> My purpose was to avoid creating multiple connections.
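Both goals are compatible: a caching get_connection() avoids multiple connections while still returning the connection. A sketch, with `_connect()` standing in for the real ldapi bind:

```python
class Restore:
    def __init__(self):
        self._conn = None

    def get_connection(self):
        # Lazily create the connection once, cache it in self._conn,
        # and always return it, so callers can't forget to set it up.
        if self._conn is None:
            self._conn = self._connect()
        return self._conn

    def _connect(self):
        # Placeholder for the real ldapi/LDAP bind.
        return object()
```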
>>>>>
>>>>>> If a variable is only used in a single method, like `filename` in
>>>>>> Restore.extract_backup, don't store it in the admintool object.
>>>>>> In general, try to avoid putting data in self if possible. It's
>>>>>> convenient but it allows complex interdependencies.
>>>>>> (Yes, I'm guilty of not following this myself.)
>>>>>
>>>>> Yes, there is certainly a bit of cleanup to do. I was in a bit of a
>>>>> rush
>>>>> to get this WIP out.
>>>>>
>>>>>> In several places, the backup tool logs critical errors and then just
>>>>>> continues on. Is that by design? I think a nonzero exit code would be
>>>>>> appropriate.
>>>>>
>>>>> I'll take a look at them, all I can say at this point is maybe.
>>>>>
>>>>>> In the ipa-restore man page, --backend is not documented.
>>>>>>
>>>>>> In db2ldif, db2bak, etc., a more concise way to get the time string
>>>>>> is `time.strftime('export_%Y_%m_%d_%H_%M_%S')`.
>>>>>>
>>>>>> When validating --gpg-keyring, it would be good to check both files
>>>>>> (.sec and .pub).
>>>>>
>>>>> Ok, I can do those.
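Both suggestions together, as a sketch (`validate_keyring` is an illustrative helper name, not the patch's actual function):

```python
import os
import time

# Timestamped export name in one call, instead of assembling the
# string from individual time.localtime() fields.
export_name = time.strftime('export_%Y_%m_%d_%H_%M_%S')

def validate_keyring(keyring):
    # Check both halves of the GPG keyring pair before using it.
    for ext in ('.sec', '.pub'):
        if not os.path.exists(keyring + ext):
            raise ValueError('No such key file: %s%s' % (keyring, ext))
```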
>>>>>
>>>>>>
>>>>>> Here:
>>>>>>      if (self.backup_type != 'FULL' and not options.data_only and
>>>>>>          not instances):
>>>>>>          raise admintool.ScriptError('Cannot restore a data backup into an empty system')
>>>>>> did you mean `(self.backup_type != 'FULL' or options.data_only) and
>>>>>> not instances`? (I'd introduce a `data_only` flag to prevent
>>>>>> confusion.)
>>>>>
>>>>> Yeah, looks like that should be an or.
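With the explicit flag Petr suggests, the corrected check might read (a sketch; `check_restore_target` and its arguments only mirror the names in the quoted snippet):

```python
def check_restore_target(backup_type, data_only_option, instances):
    # A data-only restore -- either a non-FULL backup or an explicit
    # data-only request -- needs existing DS instances to restore into.
    data_only = backup_type != 'FULL' or data_only_option
    if data_only and not instances:
        raise ValueError('Cannot restore a data backup into an empty system')
```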
>>>>>
>>>>>> In the ReplicationManager.check_repl_init reporting: use
>>>>>> int(d.total_seconds()) instead of d.seconds, as the latter doesn't
>>>>>> include full days. I don't think anyone would wait long enough for
>>>>>> the
>>>>>> overflow, but better to be correct.
>>>>>
>>>>> Sure, I doubt anyone would wait 24 hours either, but it's a
>>>>> no-brainer to make it safe.
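The overflow in question, demonstrated:

```python
import datetime

d = datetime.timedelta(days=1, seconds=5)
# .seconds silently drops the whole days:
assert d.seconds == 5
# total_seconds() counts everything:
assert int(d.total_seconds()) == 24 * 3600 + 5
```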
>>>>
>>>> I think I've addressed just about everything.
>>>>
>>>> The reason that /var/run/dirsrv and /var/lock/dirsrv are included in the
>>>> backup is for the case of a full restore. These directories are not
>>>> created/provided by the package itself, but by installing the first
>>>> instance, so we need the ownership/SELinux context preserved.
>>>>
>>>> One thing I still need to test is restoring on top of new data with multiple
>>>> replicas. So basically install, do a backup, add a bunch of data,
>>>> restore, re-initialize all the other masters and then confirm that ALL
>>>> the new data is gone. I think I've got the sequence right but this is
>>>> critical.
>>>>
>>>> This should work on fresh installs and upgrades from 3.0 (e.g. dogtag
>>>> 9/multi-instance DS configurations).
>>>>
>>>> One thing I tested was:
>>>>
>>>> - ipa-server-install
>>>> - ipa-backup
>>>> - ipa-server-install --uninstall -U
>>>> - ipa-restore ipa-full-...
>>>>
>>>> This will actually get your server back EXCEPT that dogtag won't start.
>>>> This is because the log directories are created by the instance
>>>> creation. There are two solutions: backup with --logs or manually
>>>> re-create the log directories (there are several, depending on dogtag
>>>> version).
>>>
>>> Could backup without --logs save which directories are there, and
>>> restore create empty directories if they don't exist? To me
>>> re-initializing a completely wiped machine looks like a big gain for the
>>> extra effort.
>>
>> I went ahead and made this change, it wasn't a lot of work.
>>
>>>
>>>
>>>
>>> The spec changelog needs a small rebase.
>>
>> Done.
>>
>>>
>>> I've tested some scenarios and didn't find any other issues so far.
>>>
>>> I still want to test some more with upgraded masters; installing them
>>> takes some time.
>>> Also I still need to test CA replicas (with the DNAME patches removed).
>>>
>>
>> rob
>
> I found a bug with the following topology:
>
> [IPA 2.2] <-> [IPA 3.2 upgraded from 2.2] <-> [IPA 3.2]
>
> Running ipa-restore on the 3.2 instance tries to disable the CA
> replication agreement between the other two.
> However, deep in ReplicationManager.agreement_dn it uses its own DS
> instance name instead of the Dogtag-9-DS one, so the agreement isn't
> found and the restore fails.
>
>

Yeah, I'm not 100% sure how to address that either. Is it safe to assume 
that if the port is 7389 then the instance is pki-ca? Or should we test 
for the existence of the instance name like we do master/clone?
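Either heuristic might be sketched like this. Illustrative only: the port constant, the slapd-PKI-IPA path, and the realm-to-instance mapping are all assumptions here, and whether the port check is actually safe is exactly the open question above.

```python
import os

PKI_DS_PORT = 7389  # assumption: conventional Dogtag 9 DS port

def guess_ds_instance(port, realm):
    # Prefer confirming the instance actually exists on disk (like the
    # master/clone test) over trusting the port number alone.
    if port == PKI_DS_PORT and os.path.isdir('/etc/dirsrv/slapd-PKI-IPA'):
        return 'PKI-IPA'
    return realm.replace('.', '-').upper()
```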

I've also figured out why Update in Progress loops forever. It is 
because the older ipa-replica-manage does not re-enable the replication 
agreement, so replication can't continue. I'm not entirely sure how we 
will need to go about fixing this, except through documentation.

rob



