iulian iulian - 6 months ago 16
Python Question

Why is dict definition faster in Python 2.7 than in Python 3.x?

I have encountered a (not very unusual) situation in which I had to either use a

map()
or a list comprehension expression. And then I wondered which one is faster.

This StackOverflow answer provided me the solution, but then I started to test it myself. Basically the results were the same, but I found an unexpected behavior when switching to Python 3 that I got curious about, and namely:

λ iulian-pc ~ → python --version
Python 2.7.6
λ iulian-pc ~ → python3 --version
Python 3.4.3

λ iulian-pc ~ → python -mtimeit '{}'
10000000 loops, best of 3: 0.0306 usec per loop
λ iulian-pc ~ → python3 -mtimeit '{}'
10000000 loops, best of 3: 0.105 usec per loop

λ iulian-pc ~ → python -mtimeit 'dict()'
10000000 loops, best of 3: 0.103 usec per loop
λ iulian-pc ~ → python3 -mtimeit 'dict()'
10000000 loops, best of 3: 0.165 usec per loop


I had the assumption that Python 3 is faster than Python 2, but it turned out in several posts (1, 2) that it's not the case. Then I thought that maybe Python 3.5 will perform better at such a simple task, as they state in their
README
:


The language is mostly the same, but many details, especially how
built-in objects like dictionaries and strings work, have changed
considerably, and a lot of deprecated features have finally been
removed.


But nope, it performed even worse:

λ iulian-pc ~ → python3 --version
Python 3.5.0

λ iulian-pc ~ → python3 -mtimeit '{}'
10000000 loops, best of 3: 0.144 usec per loop
λ iulian-pc ~ → python3 -mtimeit 'dict()'
1000000 loops, best of 3: 0.217 usec per loop


I've tried to dive into the Python 3.5 source code for
dict
, but my knowledge of C language is not sufficient to find the answer myself (or, maybe I even don't search in the right place).

So, my question is:



What makes the newer version of Python slower comparing to an older version of Python on a relatively simple task such as a
dict
definition, as by the common sense it should be vice-versa? I'm aware of the fact that these differences are so small that in most cases they can be neglected. It was just an observation that made me curious about why the time increased and not remained the same at least?

Answer

Let's disassemble {}:

>>> from dis import dis
>>> dis(lambda: {})
  1           0 BUILD_MAP                0
              3 RETURN_VALUE

Python 2.7 implementation of BUILD_MAP

TARGET(BUILD_MAP)
{
    x = _PyDict_NewPresized((Py_ssize_t)oparg);
    PUSH(x);
    if (x != NULL) DISPATCH();
    break;
}

Python 3.5 implementation of BUILD_MAP

TARGET(BUILD_MAP) {
    int i;
    PyObject *map = _PyDict_NewPresized((Py_ssize_t)oparg);
    if (map == NULL)
        goto error;
    for (i = oparg; i > 0; i--) {
        int err;
        PyObject *key = PEEK(2*i);
        PyObject *value = PEEK(2*i - 1);
        err = PyDict_SetItem(map, key, value);
        if (err != 0) {
            Py_DECREF(map);
            goto error;
        }
    }

    while (oparg--) {
        Py_DECREF(POP());
        Py_DECREF(POP());
    }
    PUSH(map);
    DISPATCH();
}

It's little bit more code.