[Freeipa-devel] [PATCH 24/24] Add utility classes for handling DN's along with their, unittest.

Mon Jun 20 19:55:59 UTC 2011

On 06/20/2011 10:01 AM, Rob Crittenden wrote:
> Am I misreading the documentation on how one can create a DN?
>
>   >>>  print container
> cn=users,cn=accounts
>   >>>  print basedn
> dc=example,dc=com
>   >>>  str(DN(container, basedn))
> 'cn=users,cn=accounts=dc\\=example\\,dc\\=com'
>   >>>  uid='rcrit'
>   >>>  rdnattr='uid'
>   >>>  str(DN('%s=%s' % (rdnattr, uid), container, basedn))
> 'uid=rcrit=cn\\=users\\,cn\\=accounts,dc=example,dc=com'

Either you misread the documentation, or I wrote it poorly. In either 
case it's obvious it needs to be reworked to be clearer. Let me take 
another crack at explaining :-)

[Caveat: I've made some simplifying assumptions below. For example, 
RDN's can be multi-valued. The classes handle that correctly, but only 
if you use them properly; if you're working with multi-valued RDN's 
you'll have to dig just a tad deeper to use the classes correctly.]

When you supply a sequence of strings, those strings are assumed to be 
the type (i.e. name) and value of an RDN. Since they must come in 
pairs, the parser looks for adjacent pairs of strings in the sequence. 
So taking your example, cn=users forms the first RDN, thus:

   'cn', 'users'

is the <type,value> pair of an RDN.

If that were followed by:

   'cn', 'accounts'

the next RDN would be: cn=accounts and the sequence:

   'cn', 'users', 'cn', 'accounts'

would produce:

   cn=users,cn=accounts

O.K. so why wouldn't you just say:

   'cn=users,cn=accounts'

instead of the 2 pairs:

   'cn', 'users', 'cn', 'accounts'

it's so much simpler right?

The reason, and this is key, is because 'cn=users,cn=accounts' is DN 
syntax and is subject to DN encoding rules. What is on the left and 
right side of the equal sign may NOT be the string values you expect 
them to be, rather they might be encoded. The only way to treat the LHS 
and RHS of an RDN as the ORIGINAL strings you're expecting is to 
reference them individually via the classes in the module. The classes 
know how to encode and decode and they can do it in a "smart" fashion.

It's NEVER a good idea to construct DN's from DN strings. Why? Because 
DN strings are subject to various escaping rules which, after being 
applied, produce what I call the encoded value of the DN. To complicate 
matters, different encodings can produce the same DN. Once you get into 
these edge cases most simple expectations go out the window.
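
To make the "different encodings, same DN" point concrete, here is a 
minimal sketch (Python 3). It is NOT code from the dn module: 
decode_value is a hypothetical helper that handles only the two escape 
forms of DN string syntax (RFC 4514), and it assumes hex pairs encode 
plain ASCII characters.

```python
import re

# Toy decoder for the two escape forms used in DN string syntax (RFC 4514):
#   backslash + two hex digits  -> the character with that byte value
#   backslash + any other char  -> that character, taken literally
# NOTE: decode_value is a hypothetical helper, not part of the dn module,
# and it assumes every hex pair encodes a single ASCII character.
_ESCAPE = re.compile(r'\\([0-9A-Fa-f]{2}|.)')


def decode_value(encoded):
    """Return the original string behind an encoded RDN value."""
    def replace(match):
        token = match.group(1)
        if len(token) == 2:
            # Hex-pair form, e.g. \2C
            return chr(int(token, 16))
        # Single-character form, e.g. \,
        return token
    return _ESCAPE.sub(replace, encoded)


# Two different encodings of the same value.
a = decode_value(r'Smith\, John')
b = decode_value(r'Smith\2C John')
print(a == b)
```

Both encoded strings decode to the same original value, so comparing 
the encoded text directly would wrongly report them as different.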

The simple coding answer is to always work with DN, RDN, or AVA objects 
and never with DN string syntax. The objects are aware of each other and 
perform the correct class conversions. The only time you need DN string 
syntax is at the moment you pass the DN into a LDAP library, and that is 
as simple as calling str() on the object.

O.K. so why do the classes accept DN syntax when I just told you never 
to use it? Well, welcome to the real world, where not everything has 
been converted to use the new classes yet and sometimes you get strings 
in DN syntax. Rather than being so rigid and pedantic that we barf, we 
support DN syntax, but it comes with a GIANT WARNING: programmer 
beware, use at your own risk and only if you know what you're doing.

So if DN syntax is a string and the type and value of an RDN are also 
strings, how do the classes tell the difference when they're looking at 
a sequence of values used to construct a DN? They do it by looking for 
contiguous pairs of strings in the sequence. When they find two 
adjacent strings they pull them from the sequence and form an RDN from 
them. A string is interpreted as DN syntax, to be independently parsed, 
if and only if it's not a member of a pair of strings in the sequence. 
Recall the sequence can include DN, RDN and AVA objects as well as 
strings.

Thus in your case what happened was you had two strings in the 
constructor sequence:

'cn=users,cn=accounts', 'dc=example,dc=com'

and that pair got interpreted as the LHS and RHS (type and value) of a 
single RDN.
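
The pairing rule can be sketched roughly as follows (Python 3). This is 
only an illustration of the rule as described in this mail, not the 
actual constructor code; group_args and its tuple labels are made-up 
names.

```python
def group_args(args):
    """Group constructor-style arguments using the adjacent-pair rule.

    Hypothetical illustration only, not the dn module's implementation.
    Returns a list of tuples:
      ('rdn', type, value)  two adjacent strings taken as one RDN
      ('dn_syntax', text)   a lone string, to be parsed as DN syntax
      ('object', obj)       anything else (e.g. a DN, RDN or AVA object)
    """
    grouped = []
    i = 0
    while i < len(args):
        current = args[i]
        if isinstance(current, str):
            if i + 1 < len(args) and isinstance(args[i + 1], str):
                # Two adjacent strings: consume both as <type, value>.
                grouped.append(('rdn', current, args[i + 1]))
                i += 2
                continue
            # A string with no string partner is treated as DN syntax.
            grouped.append(('dn_syntax', current))
        else:
            grouped.append(('object', current))
        i += 1
    return grouped


# The two calls from the original question.
first = group_args(['cn=users,cn=accounts', 'dc=example,dc=com'])
second = group_args(['uid=rcrit', 'cn=users,cn=accounts',
                     'dc=example,dc=com'])
print(first)
print(second)
```

In the first call both strings are consumed as one <type,value> pair. 
In the second, 'uid=rcrit' and the container string pair up and only the 
base DN string is left over to be parsed as DN syntax, which is 
consistent with the two results you saw.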

The right way to have done this would have been to construct two DN's, 
one for the base and one for the container, for example:

base_dn = DN('dc', 'example', 'dc', 'com')
container_dn = DN('cn', 'users', 'cn', 'accounts')

then any new DN can be constructed via:

user_dn = DN('cn', 'Bob', container_dn, base_dn)

Make sense?

Note the syntax for constructing the DN objects is very flexible. You 
could build it up from a sequence of RDN objects, or you could put the 
values in a list and pass the list to the constructor, e.g.

base_dn_list = ['dc', 'example', 'dc', 'com']
base_dn = DN(*base_dn_list)

or even:

base_dn_list = [RDN('dc', 'example'), RDN('dc', 'com')]
base_dn = DN(*base_dn_list)

> The patch requires one very minor change, the import from dn should be
> from ipalib.dn import ... We run the tests from the top-level.

O.K. will do. Also I added some new functionality I discovered was 
useful when I was making other fixes, such as the ability to use 
in-place addition (+= operator) and concatenation (+ operator) with DN 
syntax on the RHS. The unit test was enhanced to support those cases. 
I'll resubmit the patch with better doc (please comment on what was 
clear and what was not clear), the import fix, and the enhancements I 
just mentioned.

-- 
John Dennis <jdennis at redhat.com>
