[Freeipa-devel] DNS zone serial number updates [#2554]

Petr Spacek pspacek at redhat.com
Wed Apr 18 15:21:04 UTC 2012


On 04/18/2012 04:04 PM, Simo Sorce wrote:
> On Wed, 2012-04-18 at 15:29 +0200, Petr Spacek wrote:
>> Hello,
>>
>> first of all - snippet moved from the end:
>>   >  I think we need to try to be more consistent than what we are now. There
>>   >  may always be minor races, but the current races are too big to pass on
>>   >  IMHO.
>>   >
>> I definitely agree. Current state = completely broken zone transfer.
>>
>> Rest is in-line.
>>
>> On 04/17/2012 06:13 PM, Simo Sorce wrote:
>>> On Tue, 2012-04-17 at 17:49 +0200, Petr Spacek wrote:
>>>> Hello,
>>>>
>>>> there is IPA ticket #2554 "DNS zone serial number is not updated" [1],
>>>> which is required by RFE "Support zone transfers in bind-dyndb-ldap" [2].
>>>>
>>>> I think we need to discuss next steps with this issue:
>>>>
>>>> Basic support for zone transfers is already done in bind-dyndb-ldap. We
>>>> need second part - correct behaviour during SOA serial number update.
>>>>
>>>> The bind-dyndb-ldap plugin handles dynamic updates correctly (each
>>>> update increments the serial #), so the biggest problem lies in IPA for now.
>>>>
>>>> Modifying SOA serial number can be pretty hard, because of DS
>>>> replication. There are potential race conditions, if records are
>>>> modified/added/deleted on two or more places, replication takes some
>>>> time (because of network connection latency/problem) and zone transfer
>>>> is started in meanwhile.
>>>>
>>>> Question is: How consistent do we want to be?
>>>
>>> Enough, what we want to do is stop updating the SOA from bind-dyndb-ldap
>>> and instead update it in a DS plugin. That's because a DS plugin is the
>>> only thing that can see entries coming in from multiple servers.
>>> If you update the SOA from bind-dyndb-ldap you can potentially set it
>>> back in time because last write win in DS.
>>>
>>> This will require a persistent search so bind-dyndb-ldap can be updated
>>> with the last SOA serial number, or bind-dyndb-ldap must not cache it
>>> and always try to fetch it from ldap.
>>
>> Bind-dyndb-ldap has users with OpenLDAP. I googled a bit and OpenLDAP should
>> support Netscape SLAPI [3][4], but I don't know how hard it is to code an
>> interoperable plugin.
>> By accident I found an existing SLAPI plugin for a "concurrent" BIND-LDAP backend
>> project [5].
>
> I don't think we need to provide plugins for other platforms, we just
> need an option in bind-dyndb-ldap to tell it to assume the SOA is being
> handled by the LDAP server.
> For servers that do not have a suitable plugin bind-dyndb-ldap will keep
> working as it does now. In those cases I would suggest people to use a
> single master, but up to the integrator of the other solution.
>
>
>> Can we think for a while about other ways? I would like to find some (even
>> sub-optimal) solution without a DS plugin, if it's possible and comparably hard
>> to code.
>
> Yes, as I said you may still do something with a persistent search, but
> I do not know if persistent searches are available in OpenLDAP either.
OpenLDAP has support for (newer and standardized) SyncRepl [6][7]. I plan to 
look into it and consider writing some compatibility layer for 
psearch/syncrepl in bind-dyndb-ldap. It should not be hard, I think.

> However with a persistent search you would see entries coming in in
> "real time" even replicated ones from other replicas, so you could
> always issue a SOA serial update. Of course you still need to check for
> SOA serial updates from other DNS master servers where another
> bind-dyndb-ldap plugin is running.
>
> You have N servers potentially updating the serial at the same time. As
> long as you do not update the serial just because the serial was itself
> updated you are just going to eat one or more serial numbers off.
>
> We also do not need to make it a requirement to have the serial updated
> atomically. If 2 servers both update the number to the same value it is
> ok because they will basically be both in sync in terms of hosted
> entries.
>
> Otherwise one of the servers will update the serial again as soon as
> other entries are received.
>
> If this happens, it is possible that on one of the masters the serial
> will be updated twice even though no other change was performed on the
> entry-set. That is not a big deal though, at most it will cause a
> useless zone transfer, but zone transfer should already be somewhat rate
> limited anyway, because our zones do change frequently due to DNS
> updates from clients.
The SOA record also has refresh, retry and expiry fields. These control how
often slaves check the master for a new serial and how long they keep serving
the zone without contact. It's nicely described in [8].

There is another problem with transfers: Currently we support only full zone
transfers (AXFR), not incremental updates (IXFR), because there is no "last
record change"<->SOA# information. For now it's postponed, because nobody
wanted it.

>>>>    Can we accept these
>>>> absolutely improbable race conditions? It will be probably corrected by
>>>> next SOA update = by (any) next record change. It won't affect normal
>>>> operations, only zone transfers.
>>>
>>> Yes and No, the problem is that if 2 servers update the SOA
>>> independently you may have the serial go backwards on replication. See
>>> above.
>>>
>>>> (IMHO we should consider DNS "nature": In general is not strictly
>>>> consistent, because of massive caching at every level.)
>>>
>>> True, but the serial is normally considered monotonically increasing.
>
>> I agree. How will DS handle collisions, when the same attribute is modified
>> independently at two places? Is it simply overwritten by one of the values? I
>> can't find information about this at directory.fedoraproject.org.
>
> Last update wins.
Good to know, thanks. I wrongly expected some kind of warning mechanism. 
(Special operational attribute or something like that.)
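
To make the consequence of "last update wins" concrete, here is a minimal
sketch of how the serial can go backwards on one server. It is not DS code:
the Write class, the logical timestamps and the starting serial of 10 are all
assumptions made up for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Write:
    serial: int     # SOA serial value written by a server
    timestamp: int  # time of that write (hypothetical logical clock)


def resolve_last_write_wins(a: Write, b: Write) -> Write:
    """Return the write that survives replication: the newer one wins."""
    return a if a.timestamp > b.timestamp else b


# Both servers start from serial 10 (assumed value).
# Server A handles five updates; its last serial write happens at time 5.
server_a = Write(serial=15, timestamp=5)
# Server B handles a single update at time 6.
server_b = Write(serial=11, timestamp=6)

winner = resolve_last_write_wins(server_a, server_b)
print(winner.serial)  # prints 11
```

Server A had already published serial 15, so after replication its serial
moves back to 11, and a slave that saw 15 would not transfer the zone again.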

>> (Side question: Is it really a big problem, if it's the result of a very
>> improbable race condition? It will break zone transfer, but the next zone
>> update will correct this. As a result of this failure the last change is not
>> transferred to slave DNS servers before another zone update takes place.)
>
> If there are no new updates the next zone transfer will see again a
> serial in the past and not update at all. So, yeah I think it is a big
> deal if the SOA goes backwards.
>
>>>> If it's acceptable, we can suppress explicit SOA serial number value in
>>>> LDAP and derive actual value from latest modifyTimestamp value from all
>>>> objects in cn=dns subtree. This approach saves some hooks in IPA's LDAP
>>>> update code and will save problems with manual modifications.
>>>
>>> It will cause a big search though. It also will not take in account when
>> If we use persistent search it's not a problem. Persistent search dumps whole
>> DB to RBT in BIND's memory and only changes are transferred after this initial
>> query.
>
> You still need to search the whole cache and save additional data. (I
> sure hope you do not keep in memory the whole ldap object but a parsed
> version of it, if you keep the whole LDAP object I think we just found
> another place for enhancement. Wasting all that memory is not a good
> idea IMO).
Only DNS records are stored, i.e. parsed objects.

Please, can you explain "You still need to search the whole cache and save 
additional data."? I probably missed some important point.

>> We can compute maximal modifyTimestamp from all idnsRecord objects during
>> initial query. Any change later in time will add only single attribute to
>> transfer + max(currvalue, newvalue) operation. This way (with psearch) has
>> reasonably small overhead, I think.
>
> The problem is that max(currvalue, newvalue) is not useful.
>
> This is the scenario:
>
> time 1: server A receives updates
> time 2: server B receives updates
> time 3: bind-dyndb-ldap B computes new SOA
> time 4: server A sends its updates to server B
> time 5: bind-dyndb-ldap B see that the max timestamp has not changed
> (all new entries are older than 'time 2' as they were generated at time
> 1).
>
> This is with perfectly synchronized clocks. If A has a clock slightly in
> the past compared to B then you could even swap time 1 and 2 in absolute
> time and still get entries "in the past" at point 4.
>
> This is why using the modifyTimestamp is not workable in this case.
Ok, I didn't realize these problems. Now I know why DNS has a single master :-D
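
The scenario above can be replayed in a few lines. This is only a sketch under
simplifying assumptions: timestamps are small integers, server B's view is a
plain dict, and the record names are hypothetical.

```python
def max_timestamp(entries: dict) -> int:
    """Derive a serial candidate as the newest modifyTimestamp seen."""
    return max(entries.values())


# Server B's view of the zone: record name -> modifyTimestamp.
server_b = {"host-b": 2}                 # time 2: B receives its own update
serial_before = max_timestamp(server_b)  # time 3: B computes the serial

# time 4: server A replicates a record that was modified at time 1.
server_b["host-a"] = 1
serial_after = max_timestamp(server_b)   # time 5: B recomputes the serial

serial_changed = serial_after != serial_before
print(serial_before, serial_after, serial_changed)  # prints 2 2 False
```

The zone content on B changed at time 4, but the derived value stays at 2, so
slaves have no reason to start a transfer.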

>>> there are changes replicated from another replica that are "backdated"
>>> relative to the last modifyTimestamp.
>> If we maintain max(modifyTimestamp) value whole time, new backdated values
>> will not backdate SOA, because max(modifyTimestamp) can't move back.
>
> no the problem is not backdating the SOA serial, the problem is *not*
> updating it when new entries become available because they were "in the
> past". So if no other changes are made to DNS a zone transfer may not
> kick at all indefinitely even though the master has new/changed entries.
> This would cause a long term de-synchronization of the slaves, which I think
> is not really acceptable.
I agree with your long-term de-synchronization point, but with dynamic updates
it is not really probable.

>> It's not correct behaviour also, I know. But again: It's result of improbable
>> race condition and next zone update will correct this situation.
>
> It is not improbable at all, I think it would be a pretty common
> situation when you have different masters updating the same zone (common
> on the main zone), see explanation above.
>
>>> Also using modifyTimestamp would needlessly increment the SOA if there
>>> are changes to the entries that are not relevant to DNS (like admins
>>> changing ACIs, or other actions like that).
>> I think it's not a problem. Only consequence is unnecessary zone transfer. How
>> often admin changes ACI?
>
> Not often, so I concede the point.
>
>> If we want to save this overhead, we can count max(modifyTimestamp) only for
>> idnsRecord objects (and only for known attributes) - but I think it's not
>> necessary.
>
> I was already expecting that, but you cannot distinguish modifyTimestamp
> per attribute, only per object, so if modifyTimestamp is changed for an
> attribute you do not care about you still have to count it.
AFAIK you can watch changes only for selected attributes (through psearch).

>> There are still problems to solve without a DS plugin (specifically
>> mapping/updating the NN part of YYYYMMDDNN), but: Does this sound reasonable?
>
> Well I am not sure we need to use a YYYYMMDDNN convention to start with.
> I expect with DYNDNS updates that a 2 digit NN will never be enough,
> plus it is never updated by a human so we do not need to keep it
> readable. But I do not care either way, as long as the serial can
> handle thousands of updates per day I am fine (if this is an issue we
> need to understand how to update the serial in time intervals).
The current BIND implementation handles overflow within one day gracefully:
2012041899 -> 2012041900
So the SOA# can be in the far future, if you change the zone too often :-)
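
A small sketch of what that means for the YYYYMMDDNN reading of a serial,
assuming each update is a plain +1 on the number. The split_serial helper and
the figure of 150 updates are made up for the illustration.

```python
def split_serial(serial: int) -> tuple:
    """Read a serial as YYYYMMDDNN and return (YYYYMMDD, NN)."""
    return divmod(serial, 100)


# One step over the end of the NN range.
print(split_serial(2012041899 + 1))  # prints (20120419, 0)

# 150 updates on 2012-04-18, starting from NN = 00.
serial = 2012041800
for _ in range(150):
    serial += 1  # assumed behaviour: every update adds one
print(split_serial(serial))  # prints (20120419, 50)
```

The number stays valid and keeps increasing; only the date part stops
matching the calendar.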

AFAIK this format is traditional, but not required by the standard, as long as
the arithmetic works. [9] defines arithmetic for SOA serials, so the DS plugin
should follow it.

It says "The maximum defined increment is 2147483647 (2^31 - 1)"
This limit applies within one SOA TTL time window (so it shouldn't be a
problem, I think). I didn't look into this RFC deeply. Some practical
recommendations can be found in [10].
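
For reference, the addition and comparison rules from [9] fit in a few lines.
This is a sketch of the RFC 1982 definitions for 32-bit serials, not code from
BIND or from any DS plugin; the function names are my own.

```python
SERIAL_BITS = 32
MODULUS = 2 ** SERIAL_BITS
HALF = 2 ** (SERIAL_BITS - 1)
MAX_INCREMENT = HALF - 1  # 2147483647


def serial_add(s: int, n: int) -> int:
    """Add n to serial s, wrapping around modulo 2^32."""
    if not 0 <= n <= MAX_INCREMENT:
        raise ValueError("increment outside the range defined by RFC 1982")
    return (s + n) % MODULUS


def serial_gt(s1: int, s2: int) -> bool:
    """True if s1 is greater than s2 under RFC 1982 comparison."""
    return (s1 < s2 and s2 - s1 > HALF) or (s1 > s2 and s1 - s2 < HALF)


print(serial_add(4294967295, 1))          # prints 0
print(serial_gt(0, 4294967295))           # prints True
print(serial_gt(2012041900, 2012041899))  # prints True
print(serial_gt(11, 15))                  # prints False
```

Two serials that are exactly 2^31 apart compare as neither greater nor
smaller, which is why a single increment must stay below that distance.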

Thanks for your time.

Petr^2 Spacek

[6] http://www.openldap.org/doc/admin24/replication.html
[7] http://tools.ietf.org/html/rfc4533
[8] http://www.zytrax.com/books/dns/ch8/soa.html
[9] http://tools.ietf.org/html/rfc1982
[10] http://www.zytrax.com/books/dns/ch9/serial.html


> Simo.



