[Freeipa-devel] [PATCH] WIP backup and restore

Petr Viktorin pviktori at redhat.com
Mon Apr 8 13:40:13 UTC 2013


On 04/05/2013 10:54 PM, Rob Crittenden wrote:
> Petr Viktorin wrote:
>> On 04/04/2013 03:04 PM, Rob Crittenden wrote:
>>> Rob Crittenden wrote:
>>>> Petr Viktorin wrote:
>>>>> On 03/23/2013 05:06 AM, Rob Crittenden wrote:
>>>>>> There are strict limits on what can be restored where. Only exact
>>>>>> matching hostnames and versions are allowed right now. We can
>>>>>> probably
>>>>>> relax the hostname requirement if we're only restoring data, and the
>>>>>> version perhaps for only the first two values (so you can
>>>>>> restore a
>>>>>> 3.0.0 backup on 3.0.1 but not on 3.1.0).
>>>>>
>>>>> Do we also need to limit the versions of Dogtag, 389, Kerberos...?
>>>>> Or is what they put in /var/lib guaranteed portable across versions?
>>>>
>>>> An interesting question. I'd imagine that a major db change would also
>>>> require a major update to IPA, therefore if we restrict by IPA version
>>>> we should be ok. I AM doing an untar of files though, so yeah, there
>>>> is a risk here.
>>>>
>>>>>
>>>>>
>>>>> That's good to hear!
>>>>>
>>>>> I think while developing, you should run with -v to get all the
>>>>> output.
>>>>> And for production, please have the commands log by default (set
>>>>> log_file_name).
>>>>
>>>> Yes, I plan on adding that in the future.
>>>>
>>>>>
>>>>>> ipa-backup does all its binds using ldapi. ipa-restore needs the DM
>>>>>> password because we need to contact remote servers to enable/disable
>>>>>> replication.
>>>>>
>>>>> The restore assumes that ipa is already installed, right? I can't just
>>>>> run it on a clean machine? It would be good to check/document this.
>>>>
>>>> It depends.
>>>>
>>>> For a full backup you can actually restore to an uninstalled server. In
>>>> fact, you could restore to a machine w/no IPA bits on it at all in all
>>>> likelihood (which wouldn't be very nice, but not exactly wrong either
>>>> IMHO).
>>>>
>>>> I tested this with:
>>>>   - ipa-server-install
>>>>   - ipa-backup
>>>>   - ipa-server-install --uninstall -U
>>>>   - ipa-restore
>>>>   - ran the unit tests
>>>>
>>>>> I looked in the backup tarball and saw dirsrv PID file and lock
>>>>> directories. Are these needed?
>>>>
>>>> Probably not. I gathered some of the files to back up by watching
>>>> which files, independent of us, changed during an install.
>>>> I'll take another pass through it; there may be other oddities too.
>>>>
>>>>> The tool runners (install/tools/ipa-backup and
>>>>> install/tools/ipa-restore) are missing, so IPA doesn't build. Probably
>>>>> just a forgotten git add.
>>>>
>>>> Yup.
>>>>
>>>>>
>>>>> The patch adds an extra backslash in install/tools/man/Makefile.am;
>>>>> consider adding $(NULL) at the end.
>>>>
>>>> I'll take a look.
>>>>
>>>>>
>>>>> Backup.dirs, Backup.files, Backup.logs:
>>>>> The code modifies these class-level attributes. This means you
>>>>> can't run the command more than once in one Python session, so it
>>>>> can't be used as a library (e.g. in a hypothetical test suite).
>>>>> Please make the class attributes immutable (tuples), then in
>>>>> __init__ do `self.dirs = list(self.dirs)` to get instance-specific
>>>>> lists.
>>>>
>>>> Ok, fair point.
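A minimal sketch of the pattern I'm suggesting (class name and paths are hypothetical):

```python
class Backup:
    # Immutable class-level defaults; __init__ copies them, so run()
    # can append to the instance copy without polluting later runs.
    dirs = ('/etc/dirsrv', '/etc/httpd')  # hypothetical paths

    def __init__(self):
        self.dirs = list(self.dirs)  # instance-specific mutable copy

    def run(self):
        self.dirs.append('/var/lib/dirsrv')  # mutates this instance only


first = Backup()
first.run()
second = Backup()
assert '/var/lib/dirsrv' in first.dirs
assert '/var/lib/dirsrv' not in second.dirs  # class defaults untouched
```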
>>>>
>>>>> Code that creates temporary files/directories or does chdir()
>>>>> should be
>>>>> next to (or in) a try block whose finally: restores things.
>>>>
>>>> Yes, I mean to add a big try/finally block around everything in run
>>>> eventually (or the equivalent anyway).
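A sketch of the shape I have in mind (the helper name is made up):

```python
import os
import shutil
import tempfile

def run_in_temp_dir(work):
    # Hypothetical helper: create a temp dir, chdir into it, and make
    # sure the cwd is restored and the directory removed even if
    # `work` raises.
    old_cwd = os.getcwd()
    tmpdir = tempfile.mkdtemp()
    try:
        os.chdir(tmpdir)
        return work()
    finally:
        os.chdir(old_cwd)
        shutil.rmtree(tmpdir)

before = os.getcwd()
run_in_temp_dir(lambda: None)
assert os.getcwd() == before  # cwd restored even though we chdir'd
```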
>>>>
>>>>>
>>>>> Instead of star imports (from ipapython.ipa_log_manager import *),
>>>>> please explicitly import whatever you're using from that package. In
>>>>> this case, nothing at all!
>>>>
>>>> Yeah, I haven't run this through pylint yet to see all the bad
>>>> imports I
>>>> cobbled together.
>>>>
>>>>> If there's a get_connection() method, it should return the connection,
>>>>> and it should always be used to get it. Store the connection in
>>>>> self._conn, and rather than:
>>>>>      self.get_connection()
>>>>>      self.conn.do_work()
>>>>> do:
>>>>>      conn = self.get_connection()
>>>>>      conn.do_work()
>>>>> This makes forgetting to call get_connection() impossible.
>>>>
>>>> My purpose was to avoid creating multiple connections.
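Caching inside the getter keeps both properties: one connection, and callers that can always rely on the return value (the `_connect` stub here stands in for the real LDAP bind):

```python
class Restore:
    def __init__(self):
        self._conn = None

    def get_connection(self):
        # Lazily create and cache the connection; repeat calls return
        # the same object, so no duplicate connections are made.
        if self._conn is None:
            self._conn = self._connect()
        return self._conn

    def _connect(self):
        return object()  # stand-in for the real LDAP connect/bind


r = Restore()
conn = r.get_connection()
assert conn is r.get_connection()  # cached: only one connection made
```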
>>>>
>>>>> If a variable is only used in a single method, like `filename` in
>>>>> Restore.extract_backup, don't store it in the admintool object.
>>>>> In general, try to avoid putting data in self if possible. It's
>>>>> convenient but it allows complex interdependencies.
>>>>> (Yes, I'm guilty of not following this myself.)
>>>>
>>>> Yes, there is certainly a bit of cleanup to do. I was in a bit of a
>>>> rush
>>>> to get this WIP out.
>>>>
>>>>> In several places, the backup tool logs critical errors and then just
>>>>> continues on. Is that by design? I think a nonzero exit code would be
>>>>> appropriate.
>>>>
>>>> I'll take a look at them, all I can say at this point is maybe.
>>>>
>>>>> In the ipa-restore man page, --backend is not documented.
>>>>>
>>>>> In db2ldif, db2bak, etc., a more concise way to get the time
>>>>> string is
>>>>> `time.strftime('export_%Y_%m_%d_%H_%M_%S')`.
>>>>>
>>>>> When validating --gpg-keyring, it would be good to check both files
>>>>> (.sec and .pub).
>>>>
>>>> Ok, I can do those.
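For reference, the one-liner produces the whole timestamped name in a single call:

```python
import re
import time

# e.g. export_2013_04_08_13_40_13
stamp = time.strftime('export_%Y_%m_%d_%H_%M_%S')
assert re.fullmatch(r'export_\d{4}(_\d{2}){5}', stamp)
```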
>>>>
>>>>>
>>>>> Here:
>>>>>      if (self.backup_type != 'FULL' and not options.data_only and
>>>>>          not instances):
>>>>>          raise admintool.ScriptError(
>>>>>              'Cannot restore a data backup into an empty system')
>>>>> did you mean `(self.backup_type != 'FULL' or options.data_only)
>>>>> and not instances`? (I'd introduce a `data_only` flag to prevent
>>>>> confusion.)
>>>>
>>>> Yeah, looks like that should be an or.
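A sketch of the corrected check with the suggested flag (standalone function, with ValueError standing in for ScriptError):

```python
def check_restore(backup_type, data_only_option, instances):
    # Restoring data (from a data-only backup, or a full backup in
    # data-only mode) needs existing instances to restore into; only
    # a full restore may target an empty system.
    data_only = backup_type != 'FULL' or data_only_option
    if data_only and not instances:
        raise ValueError('Cannot restore a data backup into an empty system')

check_restore('FULL', False, [])  # full restore onto an empty system: fine
for args in (('DATA', False, []), ('FULL', True, [])):
    try:
        check_restore(*args)
        raise AssertionError('should have refused')
    except ValueError:
        pass  # both data-restore cases correctly refuse an empty system
```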
>>>>
>>>>> In the ReplicationManager.check_repl_init reporting: use
>>>>> int(d.total_seconds()) instead of d.seconds, as the latter doesn't
>>>>> include full days. I don't think anyone would wait long enough for the
>>>>> overflow, but better to be correct.
>>>>
>>>> Sure, I doubt anyone would wait 24 hours either, but it's a no-brainer to
>>>> make it safe.
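The difference is easy to demonstrate:

```python
from datetime import timedelta

d = timedelta(days=1, seconds=5)
assert d.seconds == 5                   # .seconds silently drops full days
assert int(d.total_seconds()) == 86405  # total_seconds() includes them
```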
>>>
>>> I think I've addressed just about everything.
>>>
>>> The reason that /var/run/dirsrv and /var/lock/dirsrv are included in the
>>> backup is for the case of a full restore. These directories are not
>>> created/provided by the package itself, but by installing the first
>>> instance, so we need the ownership/SELinux context preserved.
>>>
>>> One thing I need to have tested is restoring on top of new data with multiple
>>> replicas. So basically install, do a backup, add a bunch of data,
>>> restore, re-initialize all the other masters and then confirm that ALL
>>> the new data is gone. I think I've got the sequence right but this is
>>> critical.
>>>
>>> This should work on fresh installs and upgrades from 3.0 (e.g. dogtag
>>> 9/multi-instance DS configurations).
>>>
>>> One thing I tested was:
>>>
>>> - ipa-server-install
>>> - ipa-backup
>>> - ipa-server-install --uninstall -U
>>> - ipa-restore ipa-full-...
>>>
>>> This will actually get your server back EXCEPT that dogtag won't start.
>>> This is because the log directories are created by the instance
>>> creation. There are two solutions: backup with --logs or manually
>>> re-create the log directories (there are several, depending on dogtag
>>> version).
>>
>> Could backup without --logs save which directories are there, and
>> restore create empty directories if they don't exist? To me
>> re-initializing a completely wiped machine looks like a big gain for the
>> extra effort.
>
> I went ahead and made this change, it wasn't a lot of work.
>
>>
>>
>>
>> The spec changelog needs a small rebase.
>
> Done.
>
>>
>> I've tested some scenarios and didn't find any other issues so far.
>>
>> I still want to test some more with upgraded masters; installing them
>> takes some time.
>> Also I still need to test CA replicas (with the DNAME patches removed).
>>
>
> rob

I've tested some more. It works well with 3.2 masters, even one upgraded 
from 2.2.

When there's a 2.2 master in the topology, things are not so nice. When 
I ran ipa-replica-manage re-initialize, it started printing "update in 
progress" lines for several minutes with no signs of stopping.
The following appeared in the slapd error log on the source server:

[08/Apr/2013:13:38:48 +0200] NSMMReplicationPlugin - Replication 
agreement for agmt="cn=meTovm-084.idm.lab.eng.brq.redhat.com" 
(vm-084:389) could not be updated. For replication to take place, please 
enable the suffix and restart the server

I somehow doubt that's caused by this patch, though -- was replica 
re-initialize tested with disparate versions recently?
I can test again with a simpler scenario to pinpoint the issue.

-- 
Petr³



