Hunter Corry - 2 months ago
Python Question

Efficient way to check if the last element in a row (in a list of lists) is found in another list?

I currently have a list of lists (let's call it "big") with 9 columns and about 5000 rows, and it is growing. I have another list (let's call it "small") with approximately 3000 elements. My goal is to return each row in big whose last element, row[8], can be found in small. The results will be stored in a list of lists.

I have been using a list comprehension, which returns the correct output but is far too slow for my needs. It takes several seconds to process these 5000 rows (usually about 6.5 seconds, and it gets slower as the lists grow), and it needs to handle hundreds of thousands of rows quickly.

The list comprehension I wrote is:

results = [row for row in big if row[8] in small]


Sample data of list of lists (big):

[[23.4, 6.8, 9.0, 13.0, 4.0, 19.0, 2.5, 7.6, 1472709600000],
[32.1, 15.5, 17.7, 21.7, 12.7, 27.7, 11.2, 16.3, 1472882400000],
[40.8, 24.2, 26.4, 30.4, 21.4, 36.4, 19.9, 25.0, 1473055200000],
[49.5, 32.9, 35.1, 39.1, 30.1, 45.1, 28.6, 33.7, 1473228000000],
[58.2, 41.6, 43.8, 47.8, 38.8, 53.8, 37.3, 42.4, 1473400800000]]


Sample data of list (small):

[1472709600000, 1473055200000]


Desired output (results):

[[23.4, 6.8, 9.0, 13.0, 4.0, 19.0, 2.5, 7.6, 1472709600000],
[40.8, 24.2, 26.4, 30.4, 21.4, 36.4, 19.9, 25.0, 1473055200000]]


Is there a more efficient way to return each row that has its last element found in another list?

Answer

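You can eliminate the linear search of small on each iteration by converting it to a set first. Testing membership in a list compares against up to len(small) elements for every row, while a set lookup takes constant time on average: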

smallset = set(small)
results = [row for row in big if row[8] in smallset]
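To see the difference on your own data, here is a minimal sketch that wraps the set-based filter in a function and times it against the original list-based version. The names filter_rows, wanted, and time_both are made up for illustration, the synthetic data only imitates the sizes given in the question, and row[-1] is used on the assumption that the key is always the last column (the same element as row[8] for 9-column rows).

```python
import random
import timeit


def filter_rows(big, small):
    """Return the rows of big whose last element appears in small."""
    wanted = set(small)  # built once, outside the loop
    return [row for row in big if row[-1] in wanted]


# Sample data from the question.
big = [
    [23.4, 6.8, 9.0, 13.0, 4.0, 19.0, 2.5, 7.6, 1472709600000],
    [32.1, 15.5, 17.7, 21.7, 12.7, 27.7, 11.2, 16.3, 1472882400000],
    [40.8, 24.2, 26.4, 30.4, 21.4, 36.4, 19.9, 25.0, 1473055200000],
    [49.5, 32.9, 35.1, 39.1, 30.1, 45.1, 28.6, 33.7, 1473228000000],
    [58.2, 41.6, 43.8, 47.8, 38.8, 53.8, 37.3, 42.4, 1473400800000],
]
small = [1472709600000, 1473055200000]
results = filter_rows(big, small)


def time_both(n_rows=5000, n_small=3000):
    """Time list-based vs set-based filtering on synthetic data (assumed sizes)."""
    random.seed(0)
    small_syn = random.sample(range(10**6), n_small)
    big_syn = [[0.0] * 8 + [random.randrange(10**6)] for _ in range(n_rows)]
    t_list = timeit.timeit(
        lambda: [row for row in big_syn if row[-1] in small_syn], number=1
    )
    t_set = timeit.timeit(lambda: filter_rows(big_syn, small_syn), number=1)
    return t_list, t_set


if __name__ == "__main__":
    t_list, t_set = time_both()
    print(f"list lookup: {t_list:.4f}s, set lookup: {t_set:.4f}s")
```

One caveat: the elements of small must be hashable to go into a set, which is true for the integer timestamps shown here. Exact timings will vary by machine, so the printed numbers are only a rough comparison.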