zeo2396 - 1 year ago 154
Python Question

# N/3 Repeat Number. what do you think of my solution?

I'm going over a coding problem and was able to solve it (see solution below). Although the solution passes the test cases, I was wondering if you could take a look and let me know if it addresses the questions and how you would approach it. Thank you.

Youâ€™re given a read only array of n integers. Find out if any integer
occurs more than n/3 times in the array in linear time and constant

If so, return the integer. If not, return -1.

If there are multiple solutions, return any one.

Example :

Input : [1 2 3 1 1] Output : 1 #1 occurs 3 times which is more than
5/3 times.

Here's my solution in Python

``````class Solution:
# @param A : tuple of integers
# @return an integer
def repeatedNumber(self, A):
tracking = dict()

for i in A:
if i in tracking:
tracking[i] += 1
else:
tracking[i] = 1
#print tracking
results =  []
for k in tracking:
if tracking[k] > len(A)/3:
results.append(k)
#print results
return -1 if len(results) < 1 else results[0]
``````

Your algorithm is correct, but uses more than O(1) space, so it does not fit with the requirements.

This problem is a variation on the majority vote problem, where the occurrence factor must be more than n/2 instead of n/3.

For the n/2 case the Boyerâ€“Moore majority vote algorithm can be used, so I have been looking into finding a variation of this algorithm that would work for the n/3 case.

The counter which is used in Boyer-Moore would have to remain positive for a longer time to ensure a frequent value "survives" long enough to be recognised as a solution. This can be achieved by incrementing the counter with 2 when the value matches the currently retained value, and decrementing it by 1 (like in Boyer-More) when it differs from it.

Although that looks promising, it will fail identifying candidates that occur while the counter is decrementing, since the number of decrements can be 2n/3. The solution I propose to prevent this from happening is to use several counters, which work almost independently: they each track a different value. When one of them drops to zero, and the current value is not one of the other values being tracked, the zero-counter is bound to this new value, and starts incrementing again (with 2 at a time).

I found that with 4 such parallel counters -- and corresponding tracked values -- the algorithm never misses a good candidate.

When the end of the array is reached, the algorithm will have 4 candidates (if the array has at least 4 elements of course). The second sweep which is done in Boyer-More, is then also performed here, but by counting the occurrences of each of the 4 candidates. If any of those values occurs more than n/3 times, the solution has been found. If not, it is concluded there is no such value in the array.

Compared with the Boyer-More algorithm, this variation of it uses 4 times more time and 4 times more space, but that does not change the complexity: it still spends O(n) time and uses O(1) space.

Here is the code:

``````def commonNumber(a):
counts = [0, 0, 0, 0]
values = [None, None, None, None]
for value in a:
is_not_tracked = value not in values
for i, count in enumerate(counts):
if count == 0 and is_not_tracked:
values[i] = value
counts[i] += 2 if value == values[i] else -1 if count else 0
# second phase: count the occurrences for each of the 4 candidates
for value in values:
if a.count(value) > len(a) // 3:
return value
return None
``````

I tested this code with many arrays having a solution. Such arrays can be built by having -1 occurring just the minimum number of times needed to be considered a solution, then -2 occurring one time less, the value -3 once, and then a list of non-negative random values to fill up the remainder. That remainder list has fewer elements than needed for a value to be a solution, so such arrays always have -1 as solution.

Here is one of the code blocks I have tested with:

``````import random
size = 100
repeat = size//3 +1
a = [-1]*repeat + [-2]*(repeat-1) + [-3] + [int(i*random.random()) for i in range(size-2*repeat)]
for i in range(1000):
random.shuffle(a)
result = commonNumber(a)
# Verify that the outcome is as expected: it should be value -1
if result != -1:
print('fail:', a, result)
print('done')
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download