light2yellow - 2 months ago 5x
Python Question

# Efficient way to compare two lists remembering the origin for every unique element

Given this sample, which finds unique elements while being available to tell the element source

``````source_list = ["one", "two", "three", "four", "five"]
diff_list = ["zero", "one", "two", "three", "four", "six", "seven"]

source_unique = []
diff_unique = []

for entry in source_list:
if entry not in diff_list:
source_unique.append(entry)

for entry in diff_list:
if entry not in source_list:
diff_unique.append(entry)

print("Unique elements in source_list: {0}".format(source_unique))
print("Unique elements in diff_list: {0}".format(diff_unique))

###
# Unique elements in source_list: ['five']
# Unique elements in diff_list: ['zero', 'six', 'seven']
``````

is there a more efficient way to do this instead of using two additional lists and all that stuff? The main task is to be able to tell the elements' origin.

By using `set`s and taking their `difference` which has a complexity of `O(len(set_value))`:

``````>>> s1, s2 = set(source_list), set(diff_list)
>>> s1.difference(s2)
{'five'}
>>> s2.difference(s1)
{'seven', 'six', 'zero'}
``````

in this case, you might need to transform to a `list` afterwards, if necessary.

Or, you could do the same thing by using a `list` comprehension and making `source_list` and `diff_list` sets for fast membership testing:

For the `uniques` list:

``````source_unique = [v1 for v1 in source_list if v1 not in set(diff_list)]
source_unique
['five']
``````

For the `diff_unique` list:

``````diff_unique = [v1 for v1 in diff_list if v1 not in set(source_list)]
diff_unique
['zero', 'six', 'seven']
``````

Which is again `O(len(list))` unless I'm mistaking my time complexities.