SolsmaDev - 6 months ago 33
Java Question

# O(n) algorithm to find the odd-number-out in array (not odd numbers)

I'm having a great deal of trouble trying to figure this question out, and the root of that trouble is creating an algorithm of

`O(n)`
complexity. Here's the question I'm struggling with:

An Array
`A`
of length
`n`
contains integers from the range
`[0, .., n - 1]`
. However, it only contains
`n - 1`
distinct numbers. So one of the numbers is missing and another number is duplicated. Write a Java method that takes
`A`
as an input argument and returns the missing number; the method should run in
`O(n)`
.

For example, when
`A = [0, 2, 1, 2, 4]`
,
`oddOneOut()`
should return
`3`
; when
`A = [3, 0, 0, 4, 2, 1]`
,
`oddOneOut()`
should return
`5`
.

Obviously this is an easy problem to solve with an
`O(n2)`
algorithm, (and most likely
`O(n)`
, I'm just not seeing it!). I've attempted to solve it using all manner of methods, but to no avail. I'm attempting to solve it in Java, but if you're more comfortable solving it Python, that would be fine as well.

Suppose the number missing is `x` and the duplicate is `y`. If you add all numbers, the sum will be:

``````(n - 1) * n / 2 - x + y
``````

From the above, you can find `(x - y)`.....(1)

Similarly, sum the squares of the numbers. The sum will then be:

`(n - 1) * n * (2 * n - 1) / 6 - x2 + y2`

From the above you get `(x2 - y2)`....(2)

``````(2) / (1) gives (x + y).....(3)
``````

(1) + (3) gives `2 * x` and you can thereby find `x` and `y`.

Note that in this solution there is `O(1)` extra storage and is `O(n)` time complexity. The other solutions above are unnecessarily `O(n)` extra storage.

Code in mixed C/C++ for some more clarity:

``````#include <stdio.h>

int findDup(int *arr, int n, int& dup, int& missing)
{
int sum = 0;
int squares = 0;

for (int i = 0; i < n; i++) {
sum += arr[i];
squares += arr[i] * arr[i];
}

sum = (n - 1) * n / 2 - sum; // x - y

squares = (n - 1) * n * (2 * (n - 1) + 1) / 6 - squares; // x^2 - y^2

if (sum == 0) {
// no duplicates
missing = dup = 0;
return -1;
}
missing = (squares / sum + sum) / 2; // ((x^2 - y^2) / (x - y) + (x - y)) / 2 = ((x + y) + (x - y)) / 2 = x

dup = missing - sum; // x - (x - y) = y

return 0;
}

int main(int argc, char *argv[])
{
int dup = 0;
int missing = 0;

int a[] = {0, 2, 1, 2, 4};

findDup(a, sizeof(a) / sizeof(int), dup, missing);
printf("dup = [%d], missing = [%d]\n", dup, missing);

int b[] = {3, 0, 0, 4, 2, 1};
findDup(b, sizeof(b) / sizeof(int), dup, missing);
printf("dup = [%d], missing = [%d]\n", dup, missing);

return 0;
}
``````

Output:

``````dup = [2], missing = [3]
dup = [0], missing = [5]
``````