TallPaul TallPaul - 3 months ago 10
Python Question

lambda versus list comprehension performance

I recently posted a question using a lambda function and in a reply someone had mentioned lambda is going out of favor, to use list comprehensions instead. I am relatively new to Python. I ran a simple test:

import time

S=[x for x in range(1000000)]
T=[y**2 for y in range(300)]
#
#
time1 = time.time()
N=[x for x in S for y in T if x==y]
time2 = time.time()
print 'time diff [x for x in S for y in T if x==y]=', time2-time1
#print N
#
#
time1 = time.time()
N=filter(lambda x:x in S,T)
time2 = time.time()
print 'time diff filter(lambda x:x in S,T)=', time2-time1
#print N
#
#
#http://snipt.net/voyeg3r/python-intersect-lists/
time1 = time.time()
N = [val for val in S if val in T]
time2 = time.time()
print 'time diff [val for val in S if val in T]=', time2-time1
#print N
#
#
time1 = time.time()
N= list(set(S) & set(T))
time2 = time.time()
print 'time diff list(set(S) & set(T))=', time2-time1
#print N #the results will be unordered as compared to the other ways!!!
#
#
time1 = time.time()
N=[]
for x in S:
for y in T:
if x==y:
N.append(x)
time2 = time.time()
print 'time diff using traditional for loop', time2-time1
#print N


They all print the same N so I commented that print stmt out (except the last way it's unordered), but the resulting time differences were interesting over repeated tests as seen in this one example:

time diff [x for x in S for y in T if x==y]= 54.875
time diff filter(lambda x:x in S,T)= 0.391000032425
time diff [val for val in S if val in T]= 12.6089999676
time diff list(set(S) & set(T))= 0.125
time diff using traditional for loop 54.7970001698


So while I find list comprehensions on the whole easier to read, there seems to be some performance issues at least in this example.

So, two questions:


  1. Why is lambda etc being pushed aside?

  2. For the list comprehension ways, is there a more efficient implementation and how would you KNOW it's more efficient without testing? I mean, lambda/map/filter was supposed to be less efficient because of the extra function calls, but it seems to be MORE efficient.



Paul

Answer

Your tests are doing very different things. With S being 1M elements and T being 300:

[x for x in S for y in T if x==y]= 54.875

This option does 300M equality comparisons.

 

filter(lambda x:x in S,T)= 0.391000032425

This option does 300 linear searches through S.

 

[val for val in S if val in T]= 12.6089999676

This option does 1M linear searches through T.

 

list(set(S) & set(T))= 0.125

This option does two set constructions and one set intersection.


The differences in performance between these options is much more related to the algorithms each one is using, rather than any difference between list comprehensions and lambda.