ApathyBear - 3 months ago 10

Java Question

I am working out a solution to the following question:

Describe an algorithm to find the smallest one million numbers in one

billion numbers. Assume that the computer memory can hold all one

billion numbers.

The book gives the a selection rank solution but I am having a hard time understanding a few parts of it:

`public static int partition(int[] array, int left, int right, int pivot) {`

while (true) {

while (left <= right && array[left] <= pivot) {

left++;

}

while (left <= right && array[right] > pivot) {

right--;

}

if (left > right) {

return left - 1;

}

swap(array, left, right);

}

}

public static int rank(int[] array, int left, int right, int rank) {

int pivot = array[randomIntInRange(left, right)];

int leftEnd = partition(array, left, right, pivot); // returns end of left partition

int leftSize = leftEnd - left + 1;

if (leftSize == rank + 1) {

return max(array, left, leftEnd);

} else if (rank < leftSize) {

return rank(array, left, leftEnd, rank);

} else {

return rank(array, leftEnd + 1, right, rank - leftSize);

}

}

I understand most of it, but I do no understand the following two lines above:

`if (leftSize == rank + 1) {`

return max(array, left, leftEnd);

`array[left:leftEnd]`

Answer

Congratulations on trying to learn something by carefully studying a book. It's a key skill that seems to be getting rarer.

It makes general sense if the definition of the return value of `rank`

is "there exist exactly a million numbers less than or equal to `rank`

." The definition of `max`

would be something like:

```
int t = array[left];
for (int i = left + 1; i <= leftEnd; i++)
t = Math.max(t, array[i]);
return t;
```

Returning the max value is beyond the problem statement and kind of weird. It would be better and simpler just to partition the elements so that the max million are at the top: `array[0] through array[999999]`

. Then find the max only if that's actually needed

Note that because `rank`

is tail recursive, there's a simple iterative version of the same code that I think would be clearer.

I'm also not convinced this code is correct. `leftSize == rank`

in the check makes more sense than `leftSize == rank + 1`

. But without more definitions and calling code, it's hard to say for sure.