[Freeipa-devel] [PATCH 0029-0046] Internationalized domain names in DNS plugin

Fri Apr 4 13:10:21 UTC 2014

On Thu, 2014-04-03 at 15:35 +0200, Jan Cholasta wrote:
> On 2.4.2014 14:07, Martin Basti wrote:
> > Helo list,
> >
> > this patchset allows to use internationalized domian in DNS plugin.
> > - dns names are stored in ACE form(punycoded) in LDAP
> > - raw option shows dns data in ACE form, otherwise dns names are
> > converted to unicode
> > - plugin allow all characters in domain name, which are valid by IDN
> > RFCs (almost everything including non-printable), should be validation
> > more restrictive? (there is bug in dnspython with special characters,
> > will be fixed soon)
> > - TODO update WebUI to support DNSName objects
> >
> > Required patches:
> > freeipa-jcholast-255-Allow-primary-keys-to-use-different-type-than-unicod.patch
> > freeipa-jcholast-256-Support-API-version-specific-RPC-marshalling.patch
> > freeipa-jcholast-257-Replace-get_syntax-method-of-IPASimpleObject-with-ne.patch
> > freeipa-jcholast-258-Use-raw-attribute-values-in-command-result-when-raw-.patch
> > freeipa-jcholast-259-Keep-original-name-when-setting-attribute-in-LDAPEnt.patch
> >
> >
> > Patches attached.
> >
> 
> First batch of comments, so far I have only read the code/patches, 
> without doing actual testing.
> 
> 
> Patch 30:
> 
> 1)
> 
> It might make sense to put all of this into a new module (e.g. 
> dnsutil.py) rather than ipautil.
> 
I will move it
> 
> 2)
> 
> +        if isinstance(labels, str):
> +            if not labels:
> +                raise ValueError('empty string')
> ...
> +        elif isinstance(labels, unicode):
> +            if not labels:
> +                raise ValueError('empty string')
> 
> It might be nicer to:
> 
> +        if isinstance(labels, basestring) and not labels:
> +            raise ValueError('empty string')
> +
> +        if isinstance(labels, str):
> ...
> +        elif isinstance(labels, unicode):
> 
Ok I will use basestring

> 
> 3)
> 
> +    def __nonzero__(self):
> +        return True
> 
> It would be nice to include a comment about why DNSName always evaluates 
> to True (mention "@").
> 
I will add comment
> 
> 4)
> 
> +    @staticmethod
> +    def get_root():
> +        return DNSName(dns.name.root)
> +
> +    @staticmethod
> +    def get_origin_sign():
> +        return DNSName(u'@')
> +
> +    @staticmethod
> +    def get_rev_zone():
> +        return DNSName(u'in-addr.arpa.')
> +
> +    @staticmethod
> +    def get_ip6_rev_zone():
> +        return DNSName(u'ip6.arpa.')
> 
> I think you should either drop the "get_" prefix from the name, or (even 
> better) make these global constants.
> 
> I would shorten "origin_sign" to just "sign".
IMHO only "sign" is not enough.

> 
> Can you please use tuples of str objects (i.e. what dns.name.Name uses 
> internally) instead of unicode objects for the initialization? I think 
> it should be the preferred style of initializing DNSName objects (DN 
> objects do the same).
> 
I will fix it.
> 
> 5)
> 
> +    def __str__(self):
> +        return super(DNSName, self).to_text()
> 
> You don't need to use super here.
> 
I will fix it.
> 
> 6)
> 
> +    def ToASCII(self, omit_final_dot=False):
> +        return super(DNSName, 
> self).to_text(omit_final_dot=omit_final_dot).decode('ascii')
> +
> +    def ToUnicode(self, omit_final_dot=False):
> +        return super(DNSName, 
> self).to_unicode(omit_final_dot=omit_final_dot).decode('utf-8')
> 
> What was the reason for the unusual naming again? I would prefer PEP-8 
> compatible names (e.g. "to_ascii" and "to_unicode"), but if the current 
> names absolutely have to stay, please add a comment with explanation.
> 
I used .ToUnicode .ToASCII because, python standard library uses this
naming(https://docs.python.org/2/library/codecs.html#module-encodings.idna) and also RFC3490 section 4 specifiy it in that way.

> I don't like the "omit_final_dot" flag. IMHO it should be dropped and 
> whether the result includes a final dot or not should depend solely on 
> whether the name is absolute or relative. You can still use e.g. 
> "name.derelativize(root).ToUnicode()" to drop the final dot, which is 
> more explanatory.
> 
> In ToUnicode, the call to dns.name.Name.to_unicode already returns a 
> unicode object, no need to call decode on it.
> 
I will remove the flag
> 
> 7)
> 
> +    def concatenate(self, other):
> +        return DNSName(super(DNSName, self).concatenate(other).labels)
> +
> +    def relativize(self, origin):
> +        return DNSName(super(DNSName, self).relativize(origin).labels)
> +
> +    def derelativize(self, origin):
> +        return DNSName(super(DNSName, self).derelativize(origin).labels)
> +
> +    def choose_relativity(self, origin=None, relativize=True):
> +        return DNSName(super(DNSName, 
> self).choose_relativity(origin=origin, relativize=relativize).labels)
> 
> Why use ".labels" here? The DNSName constructor knows how to deal with 
> dns.name.Name objects, right?
Right, I will fix it.

> 
> 8)
> 
> +    def is_ip_reverse(self):
> +        if self.is_subdomain(self.get_rev_zone()):
> +            return True
> +        return False
> +
> +    def is_ip6_reverse(self):
> +        if self.is_subdomain(self.get_ip6_rev_zone()):
> +            return True
> +        return False
> +
> +    def is_reverse(self):
> +        if self.is_ip_reverse() or self.is_ip6_reverse():
> +            return True
> +        return False
> 
> The ifs are all redundant. Return the result of the check directly 
> ("return self.is_subdomain ..." etc.)
Ok, I will fix it.
> 
> Patch 31:
> 
> 1)
> 
> +    kwargs = Param.kwargs + (
> +        ('require_absolute', bool, False),
> +        ('require_relative', bool, False),
> +    )
> 
> What about renaming these to 'only_absolute' and 'only_relative'? IMO it 
> better captures the meaning (yes I know we already discussed the naming 
> in length :-)
IMO there is still confusion only_absolute -> makes a relative name
absolute, only_relative -> raise error if a name is absolute

What about to_absolute, only_relative, (only_absolute maybe?)
> 
> 
> 2)
> 
> +                except (dns.name.LabelTooLong, UnicodeError):
> +                    #dnspython bug?, punycoded label longer than 63 
> ASCII-chars returns Unicode Error
> +                    #instead of LabelTooLong
> 
> If that's true, it should be handled in DNSName constructor, not here.
> 
Yes it is special case when name is shorter than 63 chars but after
conversion to ACE form is longer.
I will move it
> 
> 3)
> 
> All of the dns.name exceptions inherit from dns.exception.SyntaxError, I 
> think you should add an except for it as well.
> 
I will add this except.
> 
> 4)
> 
> +                #compare if IDN normalized and original domain match
> +                normalized_domain_name = encodings.idna.nameprep(value)
> +                if(value != normalized_domain_name):
> +                    raise ValueError( _("domain name '%(domain)s' and 
> normalized domain name "
> +                                            "'%(normalized)s' do not 
> match. Please use only "
> +                                            "normalized domains")  % 
> {'domain':value,
> + 'normalized':normalized_domain_name})
> 
> Can we instead try to normalize the name in the beginning of 
> _convert_scalar rather than raise an error later?
I'm not sure what you mean, I need pure string without any normalization
to compare it.
> 
> 
> 5)
> 
> +            except Exception as e:
> +                raise ValidationError(name=self.name,
> +                                  value=value,
> +                                  index=index,
> +                                  error=unicode(e))
> 
> Since this is _convert_scalar, I think this should be ConversionError 
> (even if the errors are technically validation errors).
Okay I will raise ValidationError.
> 
> Also, use self.get_param_name() instead of self.name.
OK.
> 
> 
> 6)
> 
> I'm not a fan of the layout of _convert_scalar, can you please use 
> something like this instead (includes changes requested in the previous 
> comments):
> 
>      def _convert_scalar(self, value, index):
>          if isinstance(value, unicode):
>              value = encodings.idna.nameprep(value)
> 
>              error = None
>              try:
>                  value = DNSName(value)
>              except dns.name.BadEscape:
>                  error = _("invalid escape code")
>              ...
>              except dns.exception.SyntaxError:
>                  error = _("invalid syntax")
> 
>              if error:
>                  raise ConversionError(
>                      name=self.get_param_name(), index=index, error=error)
> 
>          return super(DNSNameParam, self)._convert_scalar(value, index)
I will.
> 
> 
> Patch 33:
> 
> Why is this patch necessary? It does not seem right to me.
> 
It has to be there.
Without this patch I'm getting conversion errors, values are returned in
tuples instead of list
Str()._convert_scalar is called somewhere out of dns
I have to take deep inspect where it occurs.

> 
> Patch 34:
> 
> This patch should be squashed with patch 32. They don't make sense 
> without each other.
> 
I will do it.

> Patch 37:
> 
> 1)
> 
> -def _rname_validator(ugettext, zonemgr):
> 
> What's up with this renaming of everything? Don't do it please, unless 
> absolutely necessary.
Cos I used both at start, IDN aware and unaware, some external plugins
also call those functions (not in this case)

> 
> 2)
> 
> You don't need to redefine has_output for dnszone_add, dnszone_mod, 
> dnszone_show, dnsrecord_add, dnsrecord_mod and dnsrecord_show. They 
> already use standard_entry.
> 
> 
> 3)
> 
> +    def post_callback(self, ldap, dn, entry_attrs, *keys, **options):
> +        assert isinstance(dn, DN)
> +        _ns_zone_post_convert(entry_attrs, **options)
> +        return dn
> +
> 
> Please don't use stuff that is not defined in this or previous patches. 
> IPA should be at least buildable after each patch is applied.
> 
My bad sorry, I added it to wrong patch after rebase
> 
> 4)
> 
>       has_output = output.standard_value
> +
>       msg_summary = _('Disabled DNS zone "%(value)s"')
> 
> No new whitespace please (seen 2 times in the patch).
I will remove it.
> 
> Patch 38:
> 
> 1)
> 
> +_dns_zone_record = DNSName(u'@')
> 
> You know you already defined a constant for this in patch 30, right?
> 
> 
> 2)
> 
> No new whitespace please (seen 2 times in the patch).
I will remove it
> 
> 3)
> 
> +    #TODO This is used by realmdomains.py (but it shouldnt allow 
> classless), not used in DNS
> 
> +    #TODO This is used by Host.py
> 
> I don't think you should add these comments, since they are resolved in 
> patches 41 and 42.
I forgot to remove them.

> 
> 4)
> 
> +def _hostname_validator_idn(ugettext, value):
> +    assert isinstance(value, DNSName)
> +
> +    req_len = 2
> +    if value.is_absolute():
> +        req_len = 3
> +    if len(value.labels) < req_len:
> +        return _('invalid domain-name: not fully qualified')
> +
> +    return None
> 
> Why the "_idn" postfix? It does not seem to be related to IDN.
> 
> However, it does seem like it does the same thing as 
> _hostname_validator, so I think they should be merged.
I will change all names to original in DNS plugin.

> 
> 5)
> 
>               zonename = zone['idnsname'][0]
> -            if revdns.endswith(zonename) and len(zonename) > len(revzone):
> -                revzone = zonename
> +            if revdns.is_subdomain(zonename.make_absolute()):
> +                if revzone is None:
> +                    revzone = zonename
> +                elif zonename.is_subdomain(revzone):
> +                    revzone = zonename
> 
> The idnsname returned from dnszone_find should already be absolute, so 
> why "zonename.make_absolute()"?
> 
> You can shorten the condition to:
> 
>      if revdns.is_subdomain(zonename) and (revzone is None or 
> zonename.is_subdomain(revzone))
> 
No, it depends what is stored in LDAP, dnszone_find could return
relative domain name (it should make sure everywhere that zone are
absolute before comparation due to compatibility to older versions (or
manual changes in LDAP))
> 
> 6)
> 
> -        revzone = u'.'.join(items[pos:])
> +        revzone = DNSName(u'.'.join(items[pos:]))
> 
> You join the items here into string which is split again in DNSName. Use 
> this instead:
> 
>      revzone = DNSName(items[pos:])
Sorry I just add add constructor there. I will fix it.
> 
> 7)
> 
> -    return revzone, revname
> +    return DNSName(revzone), DNSName(revname)
> 
> Aren't both revzone and revname DNSName objects already?
They are, sorry, I forgot to remove it
> 
> 8)
> 
> -            reason=_('DNS zone %(zone)s not found') % dict(zone=domain)
> +            reason=_('DNS zone %(zone)s not found') % 
> dict(zone=domain.ToUnicode())
> 
> Is the ".ToUnicode()" necessary? (seen 4 times in the patch)
If strings are Unicode, then not. I will fix it.
> 
> 9)
> 
> IMO normalize_zonemgr_idn and validate_zonemgr_idn should be merged into 
> normalize_zonemgr and validate_zonemgr, respectively.
Ok

> 
> 10)
> 
>   def zone_is_reverse(zone_name):
> -    zone_name = normalize_zone(zone_name)
> -    if any(zone_name.endswith(name) for name in REVERSE_DNS_ZONES):
> -        return True
> +    if(isinstance(zone_name, DNSName)):
> +        if any(zone_name.is_subdomain(DNSName(name)) for name in 
> REVERSE_DNS_ZONES):
> +            return True
> +    else:
> +        zone_name = normalize_zone(zone_name)
> +        if any(zone_name.endswith(name) for name in REVERSE_DNS_ZONES):
> +            return True
> 
>       return False
> 
> Something like this would definitely be better:
> 
>      def zone_is_reverse(zone_name):
>          if not isinstance(zone_name, DNSName):
>              zone_name = DNSName(zone_name)
>          return zone_name.is_reverse()
Thank you, I will replace it.

> 
> Patch 39:
> 
> 1)
> 
> -        Str('hostname',
> -            _hostname_validator,
> -            normalizer=_normalize_hostname,
> +        DNSNameParam('hostname',
> +            #RFC 2317 section 5.2 -- don't have to be FQDN
> 
> It seems you are changing behavior here. Such things belong in separate 
> patches.
I will move it on extra patch.

> 
> Patch 40:
> 
> 1)
> 
> +            addr = DNSName(u'')
> 
> This will always raise ValueError with the current implementation of 
> DNSName.
My bad, there should be zone_record constant (@), it works before I changed validation to not support empty strings
> 
> 2)
> 
> No new whitespace please (seen 5 times in the patch).
Sorry for that
> 
> Patch 43:
> 
> I think this patch should be squashed into patch 38.
I will do that
> 
> Patch 44:
> 
> This patch should be squashed into patch 39.
I will do that
> 
> 1)
> 
> +<<<<<<< HEAD
>   output: PrimaryKey('value', None, None)
> +=======
> +output: Output('value', <class 'ipapython.ipautil.DNSName'>, None)
> +>>>>>>> 5df7ed2... API change
> 
> It seems you overlooked this.
Shame on me!


> This is all for now. Expect more later ;-)
> 
> 
> Honza
> 
Thank you very much for your time.

-- 
Martin^2 Basti