user1362058 - 22 days ago 7
Python Question

# Efficiently check if one string contain all character of another

I have two strings: one a word and one a scramble of letters. I want to see if this scramble of letters holds enough letters to spell the word. I have come up with an algorithm to do this but it isn't efficient enough and I was hoping I could get some help making it faster.

Here's what I have so far:

``````s1 = 'hypochondriac'
s2 = 'yqhpwoewnqlchpijcdrxpoa'

temp = list(s1)
for X in s2:
for Y in temp:
if X == Y:
temp.remove(X)
X = '@'
if temp == []:
print('Found ', s1)
``````

I have an issue where once X matches I need to increment X but I didn't know how so I just take it out of the equation by making it the at symbol. I tried using break but it doesn't reach far enough to break the to the s2 loop. Either way, I'm pretty sure this double for loop idea is super slow compared to what someone with some experience would use. Any ideas?

Your code is not efficient, no, because you iterate in a double loop. For each letter in `s1`, in the worst-case scenario (no matches) you loop over all of `s2`.

Use a `Counter` object instead; these act as multi-sets, where you can both test if a character is present in O(1) time and manage remaining counts:

``````from collections import Counter

def contains(s1, s2):
s2set = Counter(s2)
for c in s1:
count = s2set[c]
if not c:
return False
if count == 1:
del s2set[c]
else:
s2set[c] = count - 1
return True
``````

You can also turn `s1` into a multi-set, and check if the multi-set for `s2` contains enough letters for each entry:

``````def contains(s1, s2):
s1set = Counter(s1)
s2set = Counter(s2)
for c, count in s1set.items():
if count > s2set[c]:
return False
return True
``````

The latter can be reduced further with the `all()` function, which returns `False` early if any of the results it is passed is `False`, `True` otherwise:

``````def contains(s1, s2):
s2set = Counter(s2)
return all(count <= s2set[c] for c, count in Counter(s1).items())
``````

In all these you only have to iterate over both `s1` and `s2` once (either directly or to produce the multi-set).

Demo of the latter:

``````>>> from collections import Counter
>>> def contains(s1, s2):
...     s2set = Counter(s2)
...     return all(count <= s2set[c] for c, count in Counter(s1).items())
...
>>> s1 = 'hypochondriac'
>>> s2 = 'yqhpwoewnqlchpijcdrxpoa'
>>> contains(s1, s2)
True
>>> contains(s1 + 'b', s2)
False
``````