Re: [Pki-devel] [PATCH 005] Replace legacy Python base64 invocations with Py3-safe code

Thursday, 24 September 2015

On 9/21/2015 8:14 AM, Christian Heimes wrote:
...
 On 2015-08-26 20:13, Endi Sukma Dewata wrote:
> As discussed on IRC, in b64encode() there's code that converts Unicode
> string data into ASCII:
>
>    if isinstance(data, six.text_type):
>        data = data.encode('ascii')
>
> This conversion will not work if the string contains non-ASCII
> characters, which limits the usage of this method.
>
> It's not that Python 3's base64.b64encode() doesn't support ASCII text,
> as noted in the method description; rather, it cannot encode a Unicode
> string because Unicode text has no binary representation until it is
> encoded.
>
> I think in this case the proper encoding for Unicode is UTF-8. So the
> line should be changed to:
>
>    if isinstance(data, six.text_type):
>        data = data.encode('utf-8')
>
> In b64decode(), the incoming data is a Unicode string containing the
> base-64 encoding characters which are all ASCII, so data.encode('ascii')
> will work, but to be more consistent it can also use data.encode('utf-8').
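
The behavior described in the quoted message can be checked with a short sketch. It assumes Python 3 and uses the built-in str type in place of six.text_type; the sample string and variable names are illustrative only and do not come from the patch.

```python
import base64

text = "caf\u00e9"  # illustrative input with one non-ASCII character

# Python 3's b64encode() takes bytes-like objects; passing str raises TypeError.
try:
    base64.b64encode(text)
    str_accepted = True
except TypeError:
    str_accepted = False

# Encoding to ASCII fails as soon as the text contains a non-ASCII character.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding to UTF-8 first works for any text, and the round trip is lossless
# as long as the decoder also assumes UTF-8.
utf8_encoded = base64.b64encode(text.encode("utf-8"))
roundtrip = base64.b64decode(utf8_encoded).decode("utf-8")
```

The UTF-8 variant only round-trips if both sides agree on the encoding, which is the kind of implicit assumption the reply below argues against.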

 We discussed the ticket a couple of weeks ago on IRC. The function is
 deliberately limited to ASCII-only text in order to avoid encoding hell.
 Python 3 tries to avoid encoding bugs by removing implicit encoding of
 text and decoding of bytes.

 The special treatment is only required for encoding/decoding X.509 data
 in JSON strings for Python 3. Since it's a special case I changed the
 patch. The additional two functions are now called decode_cert() and
 encode_cert(). The functions are only used for X.509 PEM <-> DER in JSON.
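
 A minimal sketch of what such helpers might look like, assuming Python 3
 only. The names encode_cert() and decode_cert() come from the message
 above, but the bodies, the str return type of encode_cert(), and the
 placeholder bytes are assumptions for illustration and are not taken from
 the patch itself.

```python
import base64
import json


def encode_cert(data):
    """Base64-encode DER bytes for embedding in JSON (hypothetical body)."""
    if isinstance(data, str):
        # ASCII-only by design: non-ASCII text raises UnicodeEncodeError.
        data = data.encode("ascii")
    # Returning str is an assumption; JSON strings cannot hold bytes.
    return base64.b64encode(data).decode("ascii")


def decode_cert(data):
    """Decode base64 text from JSON back into DER bytes (hypothetical body)."""
    if isinstance(data, str):
        data = data.encode("ascii")
    return base64.b64decode(data)


der = b"\x30\x82\x01\x0a"  # placeholder bytes, not a real certificate
payload = json.dumps({"cert": encode_cert(der)})
restored = decode_cert(json.loads(payload)["cert"])
```

 Keeping the ASCII restriction means a caller who passes arbitrary text gets
 an immediate UnicodeEncodeError instead of a silently chosen encoding.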

 Christian

ACK.

-- 
Endi S. Dewata
