[Freeipa-devel] [PATCH] 971 detect binary LDAP data

Rob Crittenden rcritten at redhat.com
Wed Feb 29 14:45:03 UTC 2012


Jan Cholasta wrote:
> On 28.2.2012 18:58, Rob Crittenden wrote:
>> Jan Cholasta wrote:
>>> On 28.2.2012 18:02, Petr Viktorin wrote:
>>>> On 02/28/2012 04:45 PM, Rob Crittenden wrote:
>>>>> Petr Viktorin wrote:
>>>>>> On 02/28/2012 04:02 AM, Rob Crittenden wrote:
>>>>>>> Petr Viktorin wrote:
>>>>>>>> On 02/27/2012 05:10 PM, Rob Crittenden wrote:
>>>>>>>>> Rob Crittenden wrote:
>>>>>>>>>> Simo Sorce wrote:
>>>>>>>>>>> On Mon, 2012-02-27 at 09:44 -0500, Rob Crittenden wrote:
>>>>>>>>>>>> We are pretty trusting that the data coming out of LDAP matches
>>>>>>>>>>>> its
>>>>>>>>>>>> schema but it is possible to stuff non-printable characters
>>>>>>>>>>>> into
>>>>>>>>>>>> most
>>>>>>>>>>>> attributes.
>>>>>>>>>>>>
>>>>>>>>>>>> I've added a sanity checker to keep a value as a python str
>>>>>>>>>>>> type
>>>>>>>>>>>> (treated as binary internally). This will result in a base64
>>>>>>>>>>>> encoded
>>>>>>>>>>>> blob being returned to the client.
>>>
>>> I don't like the idea of having arbitrary binary data where unicode
>>> strings are expected. It might cause some unexpected errors (I have a
>>> feeling that --addattr and/or --delattr and possibly some plugins might
>>> not handle this very well). Wouldn't it be better to just throw away the
>>> value if it's invalid and warn the user?
>>
>> This isn't for user input, it is for data stored in LDAP. Users are
>> going to have no way to provide binary data to us unless they use the
>> API themselves in which case they have to follow our rules.
>
> Well my point was that --addattr and --delattr cause an LDAP search for
> the given attribute and plugins might get the result of an LDAP search in
> their post_callback and I'm not sure if they can cope with binary data.

It wouldn't be any different than if we had the value as a unicode.

We treat the Python type str as binary data. Anything that is a str gets
base64 encoded before JSON or XML-RPC transmission.

The type unicode is considered a "string" and goes in the clear.

We determine what this type should be not from the data but from the
schema. This is a big assumption. Hopefully this answers Petr's point
as well.

We decided long ago that str means Binary and unicode means String. It
is a bit clumsy perhaps, but Python handles it well. It will be clearer
when we switch to Python 3 and have bytes and str as the types instead.
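
A rough sketch of that convention, written in Python 3 terms (bytes for
Binary, str for String). The helper name encode_for_wire is hypothetical,
and the {"__base64__": ...} wrapper is an assumption for illustration
rather than a statement of the exact wire format:

```python
import base64


def encode_for_wire(value):
    """Hypothetical helper: prepare one attribute value for transmission.

    bytes is treated as binary and base64 encoded; str is treated as
    text and passed through in the clear.
    """
    if isinstance(value, bytes):
        encoded = base64.b64encode(value).decode("ascii")
        # Assumed wrapper so the client can tell the value was binary.
        return {"__base64__": encoded}
    if isinstance(value, str):
        return value
    raise TypeError("unsupported value type: %s" % type(value).__name__)


print(encode_for_wire("admin"))
print(encode_for_wire(b"\x00\x01"))
```

The decision is made purely on the Python type, so whatever chooses that
type (the schema, plus any sanity check) decides how the value travels.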

>> We are trusting that the data in LDAP matches its schema. This is just
>> belt and suspenders verifying that it is the case.
>
> Sure, but I still think we should allow any valid unicode data to come
> from LDAP, not just what is valid in XML-RPC.

This won't affect data coming out of LDAP, only the way it is 
transmitted to the client.
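
To illustrate the kind of belt and suspenders check being discussed, here
is a minimal sketch in Python 3 terms. It is not the code from the patch:
the function name, the schema_says_text flag, and the exact set of
rejected characters are all assumptions made for the example.

```python
import re

# Assumption: reject the control characters that XML 1.0 does not allow.
# Tab (\x09), line feed (\x0a) and carriage return (\x0d) are permitted.
_INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def value_from_ldap(raw, schema_says_text):
    """Hypothetical check: return str for clean text, else keep bytes.

    raw is the value as read from LDAP (bytes). schema_says_text is True
    when the schema declares the attribute to be a string type.
    """
    if not schema_says_text:
        return raw  # schema says binary, leave it alone
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw  # not valid UTF-8, keep it as binary
    if _INVALID_XML_CHARS.search(text):
        return raw  # would not survive XML-RPC, keep it as binary
    return text


print(value_from_ldap(b"admin", True))
print(value_from_ldap(b"bad\x07value", True))
```

A value that fails the check stays binary, so it would be base64 encoded
on the way to the client instead of being sent in the clear.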

rob



