user1801810 - 4 months ago
Python Question

List comprehension with list and list of tuples

In my Python 2.7.5 code I have the following data structures:

A simple list...

>>> data["parts"]
['com', 'google', 'www']


...and a list of tuples...

>>> data["glue"]
[(1L, 'com'), (3L, 'google')]


When entering the code where these structures exist, I will always know what is in data["parts"]. At best, data["glue"] will contain tuples "matching" what is in data["parts"]; in the worst case, data["glue"] can be empty. What I need to know is which parts are missing from glue. So with the example data above, I need to know that 'www' is missing, meaning it is not in any of the tuples that may exist in data["glue"].

I first tried to produce a list of the missing pieces by way of various for loops coupled with if statements but it was very messy at best. I have tried list comprehensions and failed. Maybe list comprehension is not the way to handle this either.

Your help is much appreciated, thanks.

Answer

You can use set difference operations.

>>> print set(data['parts']) - set(i[1] for i in data['glue'])
set(['www'])

Or simply use a list comprehension:

>>> print [i for i in data['parts'] if i not in (j[1] for j in data['glue'])]
['www']

The set operation wins on speed. Running each version 10,000,000 times, the list comprehension takes over 16 seconds longer, roughly twice as long:

>>> import timeit
>>> print timeit.timeit(lambda: set(data['parts']) - set(i[1] for i in data['glue']), number=10000000)
16.8089739356
>>> print timeit.timeit(lambda: [i for i in data['parts'] if i not in (j[1] for j in data['glue'])], number=10000000)
33.5426096522
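
If the order of data['parts'] matters, the two ideas can probably be combined: build the set of glue names once, then filter the parts against it. A set difference returns an unordered set, while the sketch below keeps the original order and still tests membership against a set. The function name missing_parts and the variable glued are made up for illustration, and the integer keys are written without the L suffix so that the snippet also runs on Python 3.

```python
# Sample data mirroring the question (1L/3L written as plain ints).
data = {
    "parts": ["com", "google", "www"],
    "glue": [(1, "com"), (3, "google")],
}


def missing_parts(parts, glue):
    """Return the entries of parts that appear in no glue tuple, in order."""
    # Build the lookup set once instead of rescanning glue for every part.
    glued = {name for _, name in glue}
    return [p for p in parts if p not in glued]


missing = missing_parts(data["parts"], data["glue"])
print(missing)  # ['www']
```

When data['glue'] is empty, the lookup set is empty as well, so every part is reported as missing, which matches the worst case described in the question.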