zeo2396 zeo2396 - 5 days ago 5
Python Question

N/3 Repeat Number. what do you think of my solution?

I'm going over a coding problem and was able to solve it (see solution below). Although the solution passes the test cases, I was wondering if you could take a look and let me know if it addresses the questions and how you would approach it. Thank you.

You’re given a read only array of n integers. Find out if any integer
occurs more than n/3 times in the array in linear time and constant
additional space.

If so, return the integer. If not, return -1.

If there are multiple solutions, return any one.

Example :

Input : [1 2 3 1 1] Output : 1 #1 occurs 3 times which is more than
5/3 times.

Here's my solution in Python

class Solution:
# @param A : tuple of integers
# @return an integer
def repeatedNumber(self, A):
tracking = dict()

for i in A:
if i in tracking:
tracking[i] += 1
tracking[i] = 1
#print tracking
results = []
for k in tracking:
if tracking[k] > len(A)/3:
#print results
return -1 if len(results) < 1 else results[0]


Your algorithm is correct, but uses more than O(1) space, so it does not fit with the requirements.

This problem is a variation on the majority vote problem, where the occurrence factor must be more than n/2 instead of n/3.

For the n/2 case the Boyer–Moore majority vote algorithm can be used, so I have been looking into finding a variation of this algorithm that would work for the n/3 case.

The counter which is used in Boyer-Moore would have to remain positive for a longer time to ensure a frequent value "survives" long enough to be recognised as a solution. This can be achieved by incrementing the counter with 2 when the value matches the currently retained value, and decrementing it by 1 (like in Boyer-More) when it differs from it.

Although that looks promising, it will fail identifying candidates that occur while the counter is decrementing, since the number of decrements can be 2n/3. The solution I propose to prevent this from happening is to use several counters, which work almost independently: they each track a different value. When one of them drops to zero, and the current value is not one of the other values being tracked, the zero-counter is bound to this new value, and starts incrementing again (with 2 at a time).

I found that with 4 such parallel counters -- and corresponding tracked values -- the algorithm never misses a good candidate.

When the end of the array is reached, the algorithm will have 4 candidates (if the array has at least 4 elements of course). The second sweep which is done in Boyer-More, is then also performed here, but by counting the occurrences of each of the 4 candidates. If any of those values occurs more than n/3 times, the solution has been found. If not, it is concluded there is no such value in the array.

Compared with the Boyer-More algorithm, this variation of it uses 4 times more time and 4 times more space, but that does not change the complexity: it still spends O(n) time and uses O(1) space.

Here is the code:

def commonNumber(a):
    counts = [0, 0, 0, 0]
    values = [None, None, None, None]
    for value in a:
        is_not_tracked = value not in values
        for i, count in enumerate(counts):
            if count == 0 and is_not_tracked:
                values[i] = value
            counts[i] += 2 if value == values[i] else -1 if count else 0
    # second phase: count the occurrences for each of the 4 candidates
    for value in values:
        if a.count(value) > len(a) // 3:
            return value
    return None

I tested this code with many arrays having a solution. Such arrays can be built by having -1 occurring just the minimum number of times needed to be considered a solution, then -2 occurring one time less, the value -3 once, and then a list of non-negative random values to fill up the remainder. That remainder list has fewer elements than needed for a value to be a solution, so such arrays always have -1 as solution.

Here is one of the code blocks I have tested with:

import random
size = 100
repeat = size//3 +1
a = [-1]*repeat + [-2]*(repeat-1) + [-3] + [int(i*random.random()) for i in range(size-2*repeat)]
for i in range(1000):
    result = commonNumber(a)
    # Verify that the outcome is as expected: it should be value -1
    if result != -1: 
        print('fail:', a, result)