0xC0000022L - 3 months ago
Python Question

Trouble understanding the output in a Jupyter notebook ("Learning Cython")

In chapter 5 of the O'Reilly video tutorial "Learning Cython", there's a notebook called strings.ipynb.

The first cell loads the Cython extension:

%load_ext cython


Followed by this Cython cell:

%%cython
# cython: language_level=3

def f(char* text):
    print(text)


Then the following cell is used to demonstrate that a (Unicode) string cannot be passed as a char* argument:

f('It is I, Arthur, son of Uther Pendragon')


The outcome here is a TypeError exception.

All of the above is what I'd expect from the author's remarks in the voice-over. However, the outcome of the next cell:

f(b'It is I, Arthur, son of Uther Pendragon')


was this:

b'It is I, Arthur, son of Uther Pendragon'


and that stumped me.

Having used a plain print in the function f, why does the output appear as if it had been run through repr first, when the Cython code above clearly never calls repr?

The author doesn't even mention this somewhat (at least to me) unexpected result in the voice-over.

What gives? Why does the output look like it was first passed through repr? Are byte strings in Python 3 not "printable" (i.e. without a str method), so that they fall back to repr?

PS: I have to admit I'm coming from Python 2.x and haven't had too much exposure to Python 3.x, so perhaps the difference is therein.

Jim
Answer

Because it was. In CPython 3, bytes_str (the function behind str() for bytes objects) simply calls bytes_repr internally:

static PyObject *
bytes_str(PyObject *op)
{
    if (Py_BytesWarningFlag) {
        if (PyErr_WarnEx(PyExc_BytesWarning,
                         "str() on a bytes instance", 1))
            return NULL;
    }
    return bytes_repr(op);  // call repr on it
}

As such, print, which calls str() on its argument, will in essence call repr(bytes_instance).
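
This can be checked from plain Python 3 without Cython. The snippet below is only a minimal sketch: the variable names are illustrative, it assumes the interpreter was started without the -bb option (which turns the warning above into an error), and the decode step assumes the bytes hold ASCII text.

```python
import io

data = b'It is I, Arthur, son of Uther Pendragon'

# str() on a bytes object yields the same text as repr()
same = str(data) == repr(data)

# Capture what print() writes for a bytes object
buf = io.StringIO()
print(data, file=buf)
printed = buf.getvalue().rstrip("\n")

# To get the text without the b'...' wrapper, decode the bytes to str first
decoded = data.decode('ascii')
print(decoded)
```

Inside the Cython function, the same idea would be to print text.decode(...) with whatever encoding the caller actually used, rather than printing the bytes value directly.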
