zeo2396 - 1 year ago 106

Python Question

I'm going over the following coding problem:

Youâ€™re given a read only array of n integers. Find out if any integer

occurs more than n/3 times in the array in linear time and constant

additional space.

If so, return the integer. If not, return -1.

If there are multiple solutions, return any one.

Example :

Input : [1 2 3 1 1]

Output : 1

1 occurs 3 times which is more than 5/3 times.

I was able to solve it with this Python code:

`class Solution:`

# @param A : tuple of integers

# @return an integer

def repeatedNumber(self, A):

tracking = dict()

for i in A:

if i in tracking:

tracking[i] += 1

else:

tracking[i] = 1

#print tracking

results = []

for k in tracking:

if tracking[k] > len(A)/3:

results.append(k)

#print results

return -1 if len(results) < 1 else results[0]

The solution passes the test cases, but does it meet the requirements? And if not, how can it be made to meet them?

Recommended for you: Get network issues from **WhatsUp Gold**. **Not end users.**

Answer Source

Your algorithm is correct, but uses more than *O(1)* space, so it does not fit with the requirements.

This problem is a variation on the majority vote problem, where the occurrence factor must be more than *n/2* instead of *n/3*.

For the *n/2* case the Boyerâ€“Moore majority vote algorithm can be used, so I have been looking into finding a variation of this algorithm that would work for the *n/3* case.

The counter which is used in Boyer-Moore would have to remain positive for a longer time to ensure a frequent value "survives" long enough to be recognised as a solution. This can be achieved by incrementing the counter with 2 when the value matches the currently retained value, and decrementing it by 1 (like in Boyer-More) when it differs from it.

Although that looks promising, it will fail identifying candidates that occur while the counter is decrementing, since the number of decrements can be *2n/3*. The solution I propose to prevent this from happening is to use several counters, which work almost independently: they each track a *different* value. When one of them drops to zero, and the current value is not one of the other values being tracked, the zero-counter is bound to this new value, and starts incrementing again (with 2 at a time).

I found that with 4 such parallel counters -- and corresponding tracked values -- the algorithm never misses a good candidate.

When the end of the array is reached, the algorithm will have 4 candidates (if the array has at least 4 elements of course). The second sweep which is done in Boyer-More, is then also performed here, but by counting the occurrences of each of the 4 candidates. If any of those values occurs more than *n/3* times, the solution has been found. If not, it is concluded there is no such value in the array.

Compared with the Boyer-More algorithm, this variation of it uses 4 times more time and 4 times more space, but that does not change the complexity: it still spends *O(n)* time and uses *O(1)* space.

Here is the code:

```
def commonNumber(a):
counts = [0, 0, 0, 0]
values = [None, None, None, None]
for value in a:
is_not_tracked = value not in values
for i, count in enumerate(counts):
if count == 0 and is_not_tracked:
values[i] = value
counts[i] += 2 if value == values[i] else -1 if count else 0
# second phase: count the occurrences for each of the 4 candidates
for value in values:
if a.count(value) > len(a) // 3:
return value
return None
```

I tested this code with many arrays having a solution. Such arrays can be built by having -1 occurring just the minimum number of times needed to be considered a solution, then -2 occurring one time less, the value -3 once, and then a list of non-negative random values to fill up the remainder. That remainder list has fewer elements than needed for a value to be a solution, so such arrays always have -1 as solution.

Here is one of the code blocks I have tested with:

```
import random
size = 100
repeat = size//3 +1
a = [-1]*repeat + [-2]*(repeat-1) + [-3] + [int(i*random.random()) for i in range(size-2*repeat)]
for i in range(1000):
random.shuffle(a)
result = commonNumber(a)
# Verify that the outcome is as expected: it should be value -1
if result != -1:
print('fail:', a, result)
print('done')
```

Recommended from our users: **Dynamic Network Monitoring from WhatsUp Gold from IPSwitch**. ** Free Download**