[Pki-devel] [PATCH 005] Replace legacy Python base64 invocations with Py3-safe code

Endi Sukma Dewata edewata at redhat.com
Thu Sep 24 16:40:53 UTC 2015


On 9/21/2015 8:14 AM, Christian Heimes wrote:
> On 2015-08-26 20:13, Endi Sukma Dewata wrote:
>> As discussed on IRC, b64encode() contains code that converts Unicode
>> string data into ASCII:
>>
>>    if isinstance(data, six.text_type):
>>        data = data.encode('ascii')
>>
>> This conversion will not work if the string contains non-ASCII
>> characters, which limits the usage of this method.
>>
>> It's not that Python 3's base64.b64encode() doesn't support ASCII text,
>> as noted in the method description; rather, it cannot encode a Unicode
>> string because Unicode text has no binary representation until it is
>> encoded.
>>
>> I think in this case the proper encoding for Unicode is UTF-8. So the
>> line should be changed to:
>>
>>    if isinstance(data, six.text_type):
>>        data = data.encode('utf-8')
>>
>> In b64decode(), the incoming data is a Unicode string containing only
>> Base64 characters, which are all ASCII, so data.encode('ascii') will
>> work, but for consistency it can also use data.encode('utf-8').
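
For illustration, the difference between the two encodings can be sketched
as follows. This is a minimal example, not the code from the patch: the
helper name b64encode_text and its encoding parameter are hypothetical,
and it uses the built-in str type where the patch uses six.text_type.

```python
import base64


def b64encode_text(data, encoding="ascii"):
    """Hypothetical helper: encode text to bytes first, then Base64-encode."""
    if isinstance(data, str):
        # Text has no byte representation until an encoding is chosen.
        data = data.encode(encoding)
    # b64encode() returns bytes; the Base64 alphabet is pure ASCII.
    return base64.b64encode(data).decode("ascii")


# Pure-ASCII text works with either encoding.
ascii_result = b64encode_text("hello")

# Non-ASCII text works only when a wider encoding such as UTF-8 is used.
utf8_result = b64encode_text("h\u00e9llo", "utf-8")

# With the ASCII encoding, the same text raises UnicodeEncodeError.
try:
    b64encode_text("h\u00e9llo")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```

With ASCII the failure is explicit, whereas UTF-8 silently accepts any
text; the caller then has to know which encoding to use when decoding.
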
>
> We discussed the ticket a couple of weeks ago on IRC. The function is
> deliberately limited to ASCII-only text in order to avoid encoding hell.
> Python 3 tries to avoid encoding bugs by removing implicit encoding of
> text and decoding of bytes.
>
> The special treatment is only required for encoding/decoding X.509 data
> in JSON strings for Python 3. Since it's a special case I changed the
> patch. The additional two functions are now called decode_cert() and
> encode_cert(). The functions are only used for X.509 PEM <-> DER in JSON.
>
> Christian
>
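
A rough sketch of how such a pair of helpers might look is given here.
Only the names encode_cert() and decode_cert() come from the message
above; the bodies, the sample bytes and the JSON key are assumptions made
for illustration and may differ from the actual patch.

```python
import base64
import json


def encode_cert(data):
    """Assumed behavior: DER bytes -> Base64 text that fits in a JSON string."""
    return base64.b64encode(data).decode("ascii")


def decode_cert(data):
    """Assumed behavior: Base64 text from JSON -> DER bytes."""
    # Base64 text is ASCII by definition, so a strict ASCII encode is enough.
    return base64.b64decode(data.encode("ascii"))


# Placeholder bytes standing in for a DER-encoded certificate.
der = b"\x30\x82\x01\x0a\x02\x82\x01\x01"

# Round trip through JSON: bytes -> text -> JSON -> text -> bytes.
payload = json.dumps({"cert": encode_cert(der)})
restored = decode_cert(json.loads(payload)["cert"])
```

Keeping the text/bytes conversion inside two narrowly named functions
means the rest of the code never needs to choose an encoding.
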

ACK.

-- 
Endi S. Dewata



