sparsity - 1 month ago 3x
Python Question

# Implementing an algorithm to determine if a string has all unique characters

Context: I'm a CS n00b working my way through "Cracking the Coding Interview." The first problem asks to "implement an algorithm to determine if a string has all unique characters." My (likely naive) implementation is as follows:

``````def isUniqueChars2(string):
uchars = []
for c in string:
if c in uchars:
return False
else:
uchars.append(c)
return True
``````

The author suggests the following implementation:

``````def isUniqueChars(string):
checker = 0
for c in string:
val = ord(c) - ord('a')
if (checker & (1 << val) > 0):
return False
else:
checker |= (1 << val)
return True
``````

What makes the author's implementation better than mine (FWIW, the author's solution was in Java and I converted it to Python -- is my solution one that is not possible to implement in Java)? Or, more generally, what is desirable in a solution to this problem? What is wrong with the approach I've taken? I'm assuming there are some fundamental CS concepts (that I'm not familiar with) that are important and help inform the choice of which approach to take to this problem.

Here is how I would write this:

``````def unique(s):
return len(set(s)) == len(s)
``````

Strings are iterable so you can pass your argument directly to `set()` to get a set of the characters from the string (which by definition will not contain any duplicates). If the length of that set is the same as the length of the original string then you have entirely unique characters.

Your current approach is fine and in my opinion it is much more Pythonic and readable than the version proposed by the author, but you should change `uchars` to be a set instead of a list. Sets have O(1) membership test so `c in uchars` will be considerably faster on average if `uchars` is a set rather than a list. So your code could be written as follows:

``````def unique(s):
uchars = set()
for c in s:
if c in uchars:
return False