dysfunctional dysfunctional - 1 month ago 6
Python Question

Most Pythonic way to find/check items in a list with O(1) complexity?

The problem I'm facing is finding/checking items in a list with O(1) complexity. The following has a complexity of O(n):

'foo' in list_bar


This has a complexity of O(n) because you are using the
in
keyword on a
list
. (Refer to Python Time Complexity)

However, if you use the
in
keyword on a
set
, it has a complexity of O(1).

The reason why I need to figure out O(1) complexity for a list, and not a set, is largely due to the need to account for duplicate items within the list. Sets do not allow for duplicates. A decent example would be :

chars_available = ['h', 'e', 'l', 'o', 'o', 'z']
chars_needed = ['h', 'e', 'l', 'l', 'o']

def foo(chars_available, chars_needed):
cpy_needed = list(chars_needed)
for char in cpy_needed:
if char in chars_available:
chars_available.remove(char)
chars_needed.remove(char)
if not chars_needed: return True # if chars_needed == []
return False

foo(chars_available, chars_needed)


The example is not the focus here, so please try not to get sidetracked by it. The focus is still trying to get O(1) complexity for finding items in a list. How would I accomplish that pythonically?

(As extra credit, if you did want to show a better way of performing that operation in Python, pseudocode, or another language, I'd be happy to read it).

Thank you!

Edit:

In response to Ami Tavory's answer, I learned you can't make lists faster than O(n), but the suggestion for
collections.Counter()
helped solve the application I was working on. I'm uploading my faster solution for Stack Overflow, the performance was phenomenal! If I'm not mistaken (correct me if I'm wrong), it should be O(1) since it involves only hashable values and no loop iteration.

from collections import Counter
chars_available = ['h', 'e', 'l', 'o', 'o', 'z']
chars_needed = ['h', 'e', 'l', 'l', 'o']

def foo(chars_available, chars_needed):
counter_available = Counter(chars_available)
counter_needed = Counter(chars_needed)
out = counter_needed - counter_available
if not list(out.elements()): return True
else: return False

foo(chars_available, chars_needed)


Very fast, very pythonic! Thanks!

Answer

In general, it's impossible to find elements in a list in constant time. You could hypothetically maintain both a list and a set, but updating operations will take linear time.

You mention that your motivation is

a list, and not a set, is largely due to the need to account for duplicate items within the list. Sets do not allow for duplicates.

and ask not to focus on the example. If this is your motivation, you might want to use instead of a set, a dict mapping each element to the number of its occurrences.

You might find collections.Counter useful in particular:

In [1]: from collections import Counter

In [2]: Counter(['h', 'e', 'l', 'o', 'o', 'z'])
Out[2]: Counter({'e': 1, 'h': 1, 'l': 1, 'o': 2, 'z': 1})