KgOfHedgehogs KgOfHedgehogs - 1 month ago 12
Python Question

Comparing list comprehensions and explicit loops (3 array generators faster than 1 for loop)

I did homework and I accidentally found a strange inconsistency in the speed of the algorithm.
Here is 2 versions of code of same function bur with 1 difference: in first version i use 3 times array generator to filter some array and in second version i use 1 for loop with 3 if statements to do same filter work.

So, here is code of 1st version:

def kth_order_statistic(array, k):
pivot = (array[0] + array[len(array) - 1]) // 2
l = [x for x in array if x < pivot]
m = [x for x in array if x == pivot]
r = [x for x in array if x > pivot]
if k <= len(l):
return kth_order_statistic(l, k)
elif k > len(l) + len(m):
return kth_order_statistic(r, k - len(l) - len(m))
else:
return m[0]


And here code of 2nd version:

def kth_order_statistic2(array, k):
pivot = (array[0] + array[len(array) - 1]) // 2
l = []
m = []
r = []
for x in array:
if x < pivot:
l.append(x)
elif x > pivot:
r.append(x)
else:
m.append(x)

if k <= len(l):
return kth_order_statistic2(l, k)
elif k > len(l) + len(m):
return kth_order_statistic2(r, k - len(l) - len(m))
else:
return m[0]


IPython output for 1st version:

In [4]: %%timeit
...: A = range(100000)
...: shuffle(A)
...: k = randint(1, len(A)-1)
...: order_statisctic(A, k)
...:
10 loops, best of 3: 120 ms per loop


And for 2nd version:

In [5]: %%timeit
...: A = range(100000)
...: shuffle(A)
...: k = randint(1, len(A)-1)
...: kth_order_statistic2(A, k)
...:
10 loops, best of 3: 169 ms per loop


So why first version is faster than second? I also made third version wich using filter() function instead of array generator and it was slower than second version (it got 218 ms per loop)

Answer

Let's define the functions we will need to answer the question and timeit them:

enter image description here

we can see that the for loops without the append command is as fast as the list comprehension. In fact, if we have a look at the bytecode we can see that in the case of the list comprehension python is able to use a built-in bytecode command called LIST_APPEND instead of:

  • Load the list: 40 LOAD_FAST
  • Load the attribute: 43 LOAD_ATTRIBUTE
  • Call the loaded function: 49 CALL_FUNCTION
  • Unload the list(?): 52 POP_TOP

As you can see from the picture below the previous bytecode are missing with list comprehension and with the "loop_fast" function. Comparing the timeit of the three function is clear that those are responsible for the different timing of the three methods.

enter image description here

enter image description here

enter image description here