[Freeipa-devel] DN patch and documentation

Thu Jul 26 19:59:28 UTC 2012

On 07/11/2012 03:59 AM, Petr Viktorin wrote:
> I went over the changes to ipaserver/plugins/ldap2.py; now I can start
> the testing.
>
> I must ask about the wrapped _ext methods. I assume python-ldap exposes
> them only to mimic the C interface, which has them because C doesn't
> have default arguments. Would we lose anything if the class only defined
> the _ext methods, and aliased the non-ext variants to them? e.g.
>
>       def add_ext(self, dn, modlist, serverctrls=None, clientctrls=None):
>           assert isinstance(dn, DN)
>           dn = str(dn)
>           modlist = self.encode(modlist)
>           return self.conn.add_ext(dn, modlist, serverctrls, clientctrls)
>       add = add_ext
>
> The non-_ext variants are deprecated in the underlying C library, anyway.

With the current implementation you are correct. There would be no 
functional difference if we only utilized the _ext variants. However 
that means employing knowledge from the wrong side of an API boundary. 
 From our perspective python-ldap exposes certain methods. In particular 
we use the simpler methods because that is sufficient to meet our needs. 
I think we should code to the defined API. If python-ldap ever changes 
or we use an alternate Python ldap library the porting effort would be 
much cleaner if we don't try to outsmart ourselves.

> Why are some of the wrapper methods left out? (compare, delete_ext,
> modify, etc.)

The set of wrapped methods was chosen based on what we actually use in 
our code. Originally I set out out wrap every method, but that was a 
fair amount of work most of which was completely unnecessary because it 
would never get invoked. Why make it more complicated than it needs to 
be? If in the future we decide we need to use another method(s) we can 
add the wrapper on an as-needed basis.

>
>
> Line 1519:
>       checkmembers = copy.deepcopy(members)
>       for member_dn in checkmembers:
>           ...
>           if m not in checkmembers:
>               checkmembers.append(m) # FIXME jrd; can't modify list
> during traversal
> NACK. You really can't modify lists during traversal, a FIXME won't cut it.

Absolutely, I didn't write the code but when I saw it I realized it had 
to get fixed, the FIXME was simply a reminder to go back and clean up 
previously incorrect code I had spotted but had existed for a long time. 
Your solution below is pretty close to what I was planning to add. Fixed.

> A smaller issue is that the list `in` operator does alinear scan, so
> calling it in a loop can very slow as the list grows. Sets are much
> better for membership tests.
> So I believe you want something like:
>
>       checkmembers = set(DN(m, locked=True) for m in members)
>       checked = set()
>       while checkmembers:
>           member_dn = checkmembers.pop()
>           checked.add(member_dn)
>           ...
>           if m not in checked:
>               checkmembers.add(m)
>
>
>
> Lines 203, 825:
>       os.environ['KRB5CCNAME'] = ccache_file
> Should KRB5CCNAME be unset/restored once the code's done with it.

Agreed, once again I didn't write this but saw the problem when I moved 
the code block and made a note to fix it in the future. This particular 
place is fixed now. However I think we've got other places that do 
similar things. This is another example of the tendency to cut-n-paste 
blocks of duplicate code which drives me nuts. But how much unrelated 
code does one change in any patch? I did however add a FIXME comment to 
get that code cleaned up. In the past I would change any problem I saw 
when I touched an area of the code but I got a lot of complaints about 
changing things not directly related to the intent of the patch :-(

FWIW, the setting of KRB5CCNAME on line 825 is somewhat correct. It's 
the identity for the duration of the request. However it's redundant 
with what the session code sets up . ldap2 really shouldn't be doing 
this, ldap2 needs some clean up. FWIW the KRBCCNAME is removed by the 
session release_ipa_ccache() method.

>
>
> Line 228:
>           except IndexError:
> The try clause is too long for such a general exception, this can hide
> real errors.

Agreed, I didn't write the code, it needs clean up, how many non-dn 
related changes do I make?

>
>
>
> ====
>
> And as usual I have a lot of nitpicks. Sometimes I just want to share my
> ideas about how good code should look like, don't take them too
> seriously if you think otherwise :)
>
>
> Line 107 & 127:
>       return utf8_codec.decode(val)[0]
>       return utf8_codec.encode(unicode(val))[0]
> I believe the following would be more readable:
>       return val.decode('utf-8')
>       return unicode(val).encode('utf-8')

Yes, your suggestion is easier to read, but it's more efficient to 
lookup the codec once rather than everytime you decode.

>
>
> Line 134:
>       class ServerSchema(object):
>
> Don't use nested classes without reason

addressed previously

>
> Line 148:
>       def get_schema(self, url, conn=None, force_update=False):
>
> Is the force_update & flush() useful? None of the code seems to use it.

We don't currently use it but we probably should in the future. After we 
update schema we should refresh our view of it. Just anticipating what 
will probably be a bug somewhere down the road. At least this is better 
than what we had and is tending in the right direction.

>
>
> Line 354:
>       _SCHEMA_OVERRIDE = {
> We have a case-insensitive dict class, why not use it here?

Excellent suggestion, it's been changed.

>
>
> Lines 388, 750:
>       def is_dn_syntax(self, attr):
>           """
>           Check the schema to see if the attribute uses DN syntax.
> uses_dn_syntax or has_dn_syntax would be a better name.

Good suggestion. I made the change.

Actually, I wish LDAP attributes were a class, that way the correct 
comparisons could be automatically applied by Python based on the LDAP 
syntax of the particular attribute and things like "is_dn" and 
"is_multivalued" could be a properties of the attribute instance, etc. 
We are using an object oriented language aren't we?
>
>
> Line 395:
>           if syntax is None:
>               return False
>           return syntax == DN_SYNTAX_OID
> The if is redundant.

Good catch, absolutely right, fixed.

>
>
> Line 406:
>           if isinstance(val, bool):
>               if val:
>                   val = 'TRUE'
>               else:
>                   val = 'FALSE'
>               return self.encode(val)
> Encoding an ASCII string should be a no-op, why bother doing it?

Not sure what you mean by no-op. Do you mean str(val) would return 
either 'True' or 'False'?

In any event we are dependent on the string representation of a boolean 
being all uppper-case. Not something I invented.

>
> Line 421:
>       dct = dict()
>       for (k, v) in val.iteritems():
>           dct[self.encode(k)] = self.encode(v)
>       return dct
> Consider:
>       dct = dict((self.encode(k), self.encode(v)) for k, v in
> val.iteritems())

Why do the work of building a list of tuples when it's not necessary?

> Line 437:
>       if type(target_type) is type and isinstance(original_value,
> target_type):
> Don't use `type(x) is expected_type` for type checking! Use
> isinstance(target_type, type) so you also catch type's subclasses.

Good point, fixed.

   Or
> better yet, handle the TypeError that isinstance will raise if you don't
> give it a class.
> To be honest, I don't think it's worth it to do all this convoluted
> checking just to make sure you're not copying unnecessarily. Do you have
> a reason I can't see?

Because some constructors are expensive and it's good to pay attention 
to efficiency.

>
>
> Line 594, 604, 613:
>           ipa_result = self.convert_result(ldap_result)
>           return ipa_result
> Just return the result directly, no need for the extra variable.

It was done that way for two reasons. During development it gave me a 
value to print for debugging purposes and I thought it made the code 
clearer by indicating what is an IPA value. I don't think the extra 
variable introduces any significant inefficiencies (in C it would be 
optimized out, not sure Python's byte compiler does the same).

>
> Line 876, 894, 1041:
>       parent_dn -- DN of the parent entry (default '')
> The default is None. In any case it's redundant to specify default
> values in docstrings, as you can just look at the function itself.
>
>
> Line 1422:
>       self.handle_errors(e, **{})
> Wouldn't the following work better?
>       self.handle_errors(e, *[{}]*0)

I never liked the way handle_errors was written. The use of the kwds 
args is odd, convoluted and hard to understand.

To be honest I can't parse

*[{}]*0

>
>
> Line 1518:
>       checkmembers = copy.deepcopy(members)
> deepcopy() is rarely what you want; it either copies too much or too
> little. The Python developers made a mistake by assigning a short name
> to such an ill-specified operation. It's better to do this explicitly.
>       checkmembers = [DN(m) for m in members]
> Or to combine with my previous suggestion,
>       checkmembers = set(DN(m, locked=True) for m in members)
>

The use of deepcopy was never necessary, in fact this method had a 
number of programming errors (i.e. modification during list traversal), 
all that's been fixed now.

>
> Line 1609:
>       indirect = memberof[:]  # make a copy of the list
> I believe the list() is more readable — doesn't need a comment:
>       indirect = list(memberof)
>

Yup, more readable but you're bucking the Benevolent Dictator Guido who 
recommends the use of [:]. Anyway, fixed, list() is absolutely more 
readable.

-- 
John Dennis <jdennis at redhat.com>

Looking to carve out IT costs?
www.redhat.com/carveoutcosts/