OrangeFlash81 OrangeFlash81 - 1 year ago 62
Python Question

Why are Python 3.6 literal formatted strings so slow?

I've downloaded a Python 3.6 build from the Python Github repository, and one of my favourite new features is literal string formatting. It can be used like so:

>>> x = 2
>>> f"x is {x}"
"x is 2"

This appears to do the same thing as using the
function on a
instance. However, one thing that I've noticed is that this literal string formatting is actually very slow compared to just calling
. Here's what
says about each method:

>>> x = 2
>>> timeit.timeit(lambda: f"X is {x}")
>>> timeit.timeit(lambda: "X is {}".format(x))

If I use a string as
's argument, my results are still showing the pattern:

>>> timeit.timeit('x = 2; f"X is {x}"')
>>> timeit.timeit('x = 2; "X is {}".format(x)')

As you can see, using
takes almost half the time. I would expect the literal method to be faster because less syntax is involved. What is going on behind the scenes which causes the literal method to be so much slower?


The f"..." syntax is effectively converted to a str.join() operation on the literal string parts around the {...} expressions, and the results of the expressions themselves passed through the object.__format__() method (passing any :.. format specification in). You can see this when disassembling:

>>> import dis
>>> dis.dis(compile('f"X is {x}"', '', 'exec'))
  1           0 LOAD_CONST               0 ('')
              3 LOAD_ATTR                0 (join)
              6 LOAD_CONST               1 ('X is ')
              9 LOAD_NAME                1 (x)
             12 FORMAT_VALUE             0
             15 BUILD_LIST               2
             18 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             21 POP_TOP
             22 LOAD_CONST               2 (None)
             25 RETURN_VALUE
>>> dis.dis(compile('"X is {}".format(x)', '', 'exec'))
  1           0 LOAD_CONST               0 ('X is {}')
              3 LOAD_ATTR                0 (format)
              6 LOAD_NAME                1 (x)
              9 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             12 POP_TOP
             13 LOAD_CONST               1 (None)
             16 RETURN_VALUE

Note the BUILD_LIST and LOAD_ATTR .. (join) op-codes in that result. The new FORMAT_VALUE takes the top of the stack plus a format value (parsed out at compile time) to combine these in a object.__format__() call.

So your example, f"X is {x}", is translated to:

''.join(["X is ", x.__format__('')])

Note that this requires Python to create a list object, and call the str.join() method.

The str.format() call is also a method call, and after parsing there is still a call to x.__format__('') involved, but crucially, there is no list creation involved here. It is this difference that makes the str.format() method faster.

Note that Python 3.6 has only been released as an alpha build; this implementation can still easily change. See PEP 494 – Python 3.6 Release Schedule for the time table, as well as Python issue #27078 (opened in response to this question) for a discussion on how to further improve the performance of formatted string literals.