[Freeipa-devel] DNSSEC support design considerations: migration to RBTDB

Fri Jun 21 14:19:48 UTC 2013

On Thu, 2013-06-20 at 14:30 +0200, Petr Spacek wrote:
> Hello,
> 
> On 23.5.2013 16:32, Simo Sorce wrote:
> > On Thu, 2013-05-23 at 14:35 +0200, Petr Spacek wrote:
> >> It looks like we agree on nearly all points (I apologize if I
> >> overlooked something). I will prepare a design document for the
> >> transition to RBTDB and then another design document for the DNSSEC
> >> implementation.
> 
> The current version of the design is available at:
> https://fedorahosted.org/bind-dyndb-ldap/wiki/BIND9/Design/RBTDB

Great write-up, thanks.

> There are several questions inside (search for text "Question", it should find 
> all of them). I would like to get your opinion about the problems.
> 
> Note that 389 DS team decided to implement RFC 4533 (syncrepl), so persistent 
> search is definitely obsolete and we can do synchronization in some clever way.

Answering inline here after quoting the questions from the doc:

        > Periodical re-synchronization
        >
        > Questions

              * Do we still need periodical re-synchronization if 389 DS
                team implements RFC 4533 (syncrepl)? It wasn't
                considered in the initial design.

We probably do. We have to be especially careful of the case when a
replica is re-initialized. We should either automatically detect that
this is happening or change ipa-replica-manage to kick named somehow.

We also need a tool or maybe a special attribute in LDAP that is
monitored so that we can tell bind-dyndb-ldap to do a full rebuild of
the cache on demand. This way admins can force a rebuild if they end up
noticing something wrong.

              * What about dynamic updates during re-synchronization?

Should we return a temporary error ? Or maybe just queue up the change
and apply it right after the resync operation has finished ?
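
A minimal sketch of the "queue it up" option, just to make it concrete. Everything here is hypothetical (the UpdateGate class, its method names and the apply callback are made up for illustration and are not bind-dyndb-ldap code):

```python
from collections import deque


class UpdateGate:
    """Hypothetical gate that defers dynamic updates while a resync runs."""

    def __init__(self, apply):
        self._apply = apply        # callback that actually applies an update
        self._resyncing = False
        self._pending = deque()    # updates received during the resync

    def start_resync(self):
        self._resyncing = True

    def submit(self, update):
        # During a resync the update is only queued, otherwise applied at once.
        if self._resyncing:
            self._pending.append(update)
            return "queued"
        self._apply(update)
        return "applied"

    def finish_resync(self):
        # Replay queued updates in arrival order once the resync is done.
        self._resyncing = False
        while self._pending:
            self._apply(self._pending.popleft())
```

The alternative (returning a temporary error) would simply replace the append with an error return, at the cost of making clients retry.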

              * How to get sorted list of entries from LDAP? Use LDAP
                server-side sorting? Do we have necessary indices?

We can do client side sorting as well I guess, I do not have a strong
opinion here. The main reason why you need ordering is to detect deleted
records, right ? Is there a way to mark rbtdb records as updated instead
(with a generation number) and then do a second pass on the rbtdb tree
and remove any record that was not updated with the generation number ?
This would also allow us to keep accepting dynamic updates by simply
marking records as generation+1 so that the resync will not overwrite
records that are updated during the resync phase.
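
To illustrate the generation idea, here is a rough sketch using a plain dict in place of the rbtdb tree. The Cache class and its methods are hypothetical. One detail is my own assumption: the generation is bumped by 2 per resync, so that records stamped generation+1 by a dynamic update survive the running resync but are still checked by the next one.

```python
class Cache:
    """Toy stand-in for the rbtdb cache: name -> (data, generation)."""

    def __init__(self):
        self.generation = 0
        self.records = {}

    def begin_resync(self):
        # Step by 2 so that generation+1 stays reserved for dynamic updates.
        self.generation += 2
        return self.generation

    def resync_put(self, name, data, gen):
        # Skip records that a dynamic update touched during this resync.
        current = self.records.get(name)
        if current is not None and current[1] > gen:
            return
        self.records[name] = (data, gen)

    def dynamic_update(self, name, data):
        # Stamp with generation+1 so the running resync does not overwrite it.
        self.records[name] = (data, self.generation + 1)

    def end_resync(self, gen):
        # Second pass: remove every record the resync did not refresh.
        stale = [n for n, (_, g) in self.records.items() if g < gen]
        for name in stale:
            del self.records[name]
        return sorted(stale)
```

With this scheme no sorted list from LDAP is needed: deleted entries are simply the ones left behind with an old generation number after the second pass.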

        > (Filesystem) cache maintenance

        > Questions: How often should we save the cache from operating
        > memory to disk?

A prerequisite for evaluating this question: how expensive is it
to save the cache ? Is DNS responsive during the save or does the
operation block updates or other functionality ?

              * On shutdown only?

NACK, you are left with very stale data on crashes.

              * On start-up (after initial synchronization) and on
                shutdown?

It makes sense to dump right after a big synchronization if it doesn't
add substantial operational issues. Otherwise maybe a short interval
after synchronization.

              * Periodically? How often? At the end of periodical
                re-synchronization?

Periodically is probably a good idea, if I understand it correctly it
means that it will make it possible to substantially reduce the load on
startup as we will have less data to fetch from a syncrepl request.

              * Each N updates?

I prefer a combination of each N updates but with time limits to avoid
doing it too often.
I.e. something like every 1000 changes but not more often than every 30
minutes and not less often than every 8 hours. (Numbers completely made up
and need to be tuned based on the answer to the prerequisite question
above).
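
The combined rule could look roughly like the sketch below. The function name and parameters are hypothetical, the defaults are the made-up numbers from above, and skipping the dump when nothing has changed is my own assumption:

```python
def should_dump(changes_since_dump, seconds_since_dump,
                max_changes=1000, min_interval=30 * 60, max_interval=8 * 3600):
    """Decide whether to save the cache to disk now (illustrative only)."""
    if changes_since_dump == 0:
        return False    # assumption: nothing new, so nothing to write
    if seconds_since_dump < min_interval:
        return False    # never more often than every 30 minutes
    if seconds_since_dump >= max_interval:
        return True     # never less often than every 8 hours
    return changes_since_dump >= max_changes
```

So a burst of 5000 changes one minute after the last dump still waits for the 30 minute mark, while a single change is written out after at most 8 hours.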

              * If N % of the database was changed? (pspacek's favorite)

The problem with using % database is that for very small zones you risk
getting stuff saved too often, as changing a few records quickly makes
the % big compared to the zone size. For example a zone with 50 records
has a 10% change after just 5 records are changed. Conversely a big zone
requires a huge number of changes before the % of changes builds up,
potentially leading to dumping the database too infrequently. For
example, a zone with 100000 records needs 10000 changes before you
reach the 10% mark. If dyndns updates are disabled this means the zone
may not get saved for weeks or months.
A small zone will also syncrepl quickly so it would be useless to save
it often while a big zone is better if it is up to date on disk so the
syncrepl operation will cost less on startup.
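
The arithmetic behind the two examples, as a tiny sketch (the helper name is made up):

```python
def changes_for_percent(zone_size, percent):
    """Number of changed records needed to reach `percent` of the zone."""
    # Integer ceiling division, so a partial record counts as a full change.
    return -(-zone_size * percent // 100)


small = changes_for_percent(50, 10)       # 5 changes trigger a save
big = changes_for_percent(100000, 10)     # 10000 changes trigger a save
```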

Finally N % is also hard to compute. What do you count towards it ?
Only the total number of records changed ? Or do you also factor in
whether the same record is changed multiple times ?
Consider fringe cases: a zone with 1000 entries where only 1 entry is
changed 2000 times in a short period (a malfunctioning client, or an
attack, sending lots of updates for its record).
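
The fringe case gives very different numbers depending on what is counted. A small sketch, where the function and the host name are purely illustrative:

```python
def change_percent(changed_names, zone_size, distinct=True):
    """Percentage of the zone that changed, by distinct or by total updates."""
    count = len(set(changed_names)) if distinct else len(changed_names)
    return 100.0 * count / zone_size


# One record in a 1000-entry zone updated 2000 times.
updates = ["host1.example.com."] * 2000
by_total = change_percent(updates, 1000, distinct=False)    # 200.0
by_distinct = change_percent(updates, 1000, distinct=True)  # about 0.1
```

Counting every update says the zone changed twice over, counting distinct records says almost nothing changed, so the design needs to say which one it means.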

Additional questions:

I see you mention:
"Cache non-existing records, i.e. do not repeat LDAP search for each
query"

I assume this is fine and we rely on syncrepl to give us an update and
override the negative cache if the record that has been negatively
cached suddenly appears via replication through another master, right ?

If we rely on syncrepl, are we going to ever make direct LDAP searches
at all ? Or do we rely fully on having it send us any changes and
therefore we always reply directly from the rbtdb database ?

Simo.

-- 
Simo Sorce * Red Hat, Inc * New York