[Freeipa-devel] DN patch and documentation

Fri Jul 27 15:18:31 UTC 2012

The nitpicks continue. None of the things in this mail are holding up 
the patch, consider it more of a discussion about good practices. (I'm 
assuming you're interested.)

ipaserver/rpcserver.py:
      elif isinstance(val, (list, tuple)):
-        new_list = []
-        n = len(val)
-        i = 0
-        while i < n:
-            v = val[i]
-            if isinstance(v, str):
-                new_list.append({'__base64__' : base64.b64encode(v)})
-            else:
-                new_list.append(json_encode_binary(v))
-            i += 1
+        new_list = [json_encode_binary(v) for v in val]
          del val
          return new_list

You might as well remove the `del val`s.

ipapython/dn.py:
-Concatenation and In-Place Addition:
+Concatenation, In-Place Addition, Insetion:

s/Insetion/Insertion/

In the giant comment in ipapython/dn.py:

+by reference" (mutable objects can be modifed inside the
unbalanced parenthesis

+and B are the same object Python says no [2]_. For those familiar with Lisp
missing reference

Your example here only applies to interactive sessions of current 
versions of CPython, i.e. the Python implementation we use; other 
Pythons are free to use same or different objects for equal strs/ints as 
they choose. Even in CPython literals happen to become the same object 
if they appear in the same compilation unit:
     >>> 'abc' == 'abc'
     True
But in general I think lengthy explanations of Python's behavior shuld 
go in a Python tutorial. The docstring should say what the code does and 
how to use it, and comments can summarize design/implementation 
decisions with simple justification (possibly with links to 
tutorials/documentation -- your prose could make good blog posts).
By the way, are you aware that the text in dn.py is not a docstring, 
because it's not the first statement in the file?

+verses using a hash value for dict keys or sets is critical to
s/verses/versus/

Good job on the DN tests!

On 07/26/2012 09:59 PM, John Dennis wrote:
> On 07/11/2012 03:59 AM, Petr Viktorin wrote:
>> I went over the changes to ipaserver/plugins/ldap2.py; now I can start
>> the testing.
>>
>> I must ask about the wrapped _ext methods. I assume python-ldap exposes
>> them only to mimic the C interface, which has them because C doesn't
>> have default arguments. Would we lose anything if the class only defined
>> the _ext methods, and aliased the non-ext variants to them? e.g.
>>
>>       def add_ext(self, dn, modlist, serverctrls=None, clientctrls=None):
>>           assert isinstance(dn, DN)
>>           dn = str(dn)
>>           modlist = self.encode(modlist)
>>           return self.conn.add_ext(dn, modlist, serverctrls, clientctrls)
>>       add = add_ext
>>
>> The non-_ext variants are deprecated in the underlying C library, anyway.
>
>
> With the current implementation you are correct. There would be no
> functional difference if we only utilized the _ext variants. However
> that means employing knowledge from the wrong side of an API boundary.
>  From our perspective python-ldap exposes certain methods. In particular
> we use the simpler methods because that is sufficient to meet our needs.
> I think we should code to the defined API. If python-ldap ever changes
> or we use an alternate Python ldap library the porting effort would be
> much cleaner if we don't try to outsmart ourselves.

Fair enough.

>> Lines 203, 825:
>>       os.environ['KRB5CCNAME'] = ccache_file
>> Should KRB5CCNAME be unset/restored once the code's done with it.
>
> Agreed, once again I didn't write this but saw the problem when I moved
> the code block and made a note to fix it in the future. This particular
> place is fixed now. However I think we've got other places that do
> similar things. This is another example of the tendency to cut-n-paste
> blocks of duplicate code which drives me nuts. But how much unrelated
> code does one change in any patch? I did however add a FIXME comment to
> get that code cleaned up. In the past I would change any problem I saw
> when I touched an area of the code but I got a lot of complaints about
> changing things not directly related to the intent of the patch :-(

Also fair. Sorry for different standards, I know it's frustrating to 
have the line drawn in a different place each time.

>> Line 107 & 127:
>>       return utf8_codec.decode(val)[0]
>>       return utf8_codec.encode(unicode(val))[0]
>> I believe the following would be more readable:
>>       return val.decode('utf-8')
>>       return unicode(val).encode('utf-8')
>
> Yes, your suggestion is easier to read, but it's more efficient to
> lookup the codec once rather than everytime you decode.

I wouldn't worry about that; lookups are pretty fast -- I'd say 
comparable to creating an unnecessary tuple. Of course it's hard to know 
without profiling.

>> Line 354:
>>       _SCHEMA_OVERRIDE = {
>> We have a case-insensitive dict class, why not use it here?
>
> Excellent suggestion, it's been changed.

Please also change the use:
         attr_lower = attr.lower()
         syntax = self._SCHEMA_OVERRIDE.get(attr_lower)
to
         syntax = self._SCHEMA_OVERRIDE[attr]

>> Lines 388, 750:
>>       def is_dn_syntax(self, attr):
>>           """
>>           Check the schema to see if the attribute uses DN syntax.
>> uses_dn_syntax or has_dn_syntax would be a better name.
>
> Good suggestion. I made the change.
>
> Actually, I wish LDAP attributes were a class, that way the correct
> comparisons could be automatically applied by Python based on the LDAP
> syntax of the particular attribute and things like "is_dn" and
> "is_multivalued" could be a properties of the attribute instance, etc.
> We are using an object oriented language aren't we?

All problems can be solved by another level of abstraction.
(except for the problem of too many layers of abstraction)

Sounds like a ticket for the deferred bucket :)

>> Line 406:
>>           if isinstance(val, bool):
>>               if val:
>>                   val = 'TRUE'
>>               else:
>>                   val = 'FALSE'
>>               return self.encode(val)
>> Encoding an ASCII string should be a no-op, why bother doing it?
>
> Not sure what you mean by no-op. Do you mean str(val) would return
> either 'True' or 'False'?
>
> In any event we are dependent on the string representation of a boolean
> being all uppper-case. Not something I invented.

I meant the `return self.encode(val)`, with val being either 'TRUE' or 
'FALSE'. The following should suffice:

            if isinstance(val, bool):
                if val:
                    return 'TRUE'
                else:
                    return 'FALSE'

Sorry for not being clear.

>> Line 421:
>>       dct = dict()
>>       for (k, v) in val.iteritems():
>>           dct[self.encode(k)] = self.encode(v)
>>       return dct
>> Consider:
>>       dct = dict((self.encode(k), self.encode(v)) for k, v in
>> val.iteritems())
>
> Why do the work of building a list of tuples when it's not necessary?

It's a generator expression, it doesn't build a list.

>> Line 437:
>>       if type(target_type) is type and isinstance(original_value,
>> target_type):
>> Don't use `type(x) is expected_type` for type checking! Use
>> isinstance(target_type, type) so you also catch type's subclasses.
>
> Good point, fixed.
>
>    Or
>> better yet, handle the TypeError that isinstance will raise if you don't
>> give it a class.
>> To be honest, I don't think it's worth it to do all this convoluted
>> checking just to make sure you're not copying unnecessarily. Do you have
>> a reason I can't see?
>
> Because some constructors are expensive and it's good to pay attention
> to efficiency.

It is, but I'm not convinced this is the right place. Complicating the 
code (especially in a fairly obscure method with no dosctring) has its 
own drawbacks.

If constructors are too expensive it might be better to optimize those 
-- for example sharing immutable AVAs/RDNs when copying DNs would 
probably give a similar speedup here, as well as helping other parts of 
the code.

>> Line 594, 604, 613:
>>           ipa_result = self.convert_result(ldap_result)
>>           return ipa_result
>> Just return the result directly, no need for the extra variable.
>
> It was done that way for two reasons. During development it gave me a
> value to print for debugging purposes and I thought it made the code
> clearer by indicating what is an IPA value. I don't think the extra
> variable introduces any significant inefficiencies (in C it would be
> optimized out, not sure Python's byte compiler does the same).

It doesn't, but the slowdown is negligible.
Never mind, this is an extremely minor point.

>> Line 1422:
>>       self.handle_errors(e, **{})
>> Wouldn't the following work better?
>>       self.handle_errors(e, *[{}]*0)
>
> I never liked the way handle_errors was written. The use of the kwds
> args is odd, convoluted and hard to understand.
>
> To be honest I can't parse
>
> *[{}]*0

Sorry, lame attempt at humor.
The **{} doesn't do anything at all, you can remove it.
The keyword arguments in handle_errors are unused, so you can remove 
that as well if you want.

-- 
Petr³