Jonathan Mee - 6 months ago 71

C++ Question

There are a lot of claims on StackOverflow and elsewhere that

`nth_element`

I want to know how this can be achieved. I looked at Wikipedia's explanation of Introselect and that just left me more confused. How can an algorithm switch between QSort and Median-of-Medians?

I found the Introsort paper here: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.5196&rep=rep1&type=pdf But that says:

In this paper, we concentrate on the sorting problem and return to the selection problem only briefly in a later section.

I've tried to read through the STL itself to understand how

`nth_element`

Could someone show me pseudo-code for how Introselect is implemented? Or even better, actual C++ code other than the STL of course :)

Answer

You asked two questions, the titular one

How is nth_element Implemented?

Which you already answered:

There are a lot of claims on StackOverflow and elsewhere that nth_element is O(n) and that it is typically implemented with Introselect.

Which I also can confirm from looking at my stdlib implementation. (More on this later.)

And the one where you don't understand the answer:

How can an algorithm switch between QSort and Median-of-Medians?

Lets have a look at pseudo code that I extracted from my stdlib:

```
nth_element(first, nth, last)
{
if (first == last || nth == last)
return;
introselect(first, nth, last, log2(last - first) * 2);
}
introselect(first, nth, last, depth_limit)
{
while (last - first > 3)
{
if (depth_limit == 0)
{
// [NOTE by editor] This should be median-of-medians instead.
// [NOTE by editor] See Azmisov's comment below
heap_select(first, nth + 1, last);
// Place the nth largest element in its final position.
iter_swap(first, nth);
return;
}
--depth_limit;
cut = unguarded_partition_pivot(first, last);
if (cut <= nth)
first = cut;
else
last = cut;
}
insertion_sort(first, last);
}
```

Without getting into details about the referenced functions `heap_select`

and `unguarded_partition_pivot`

we can clearly see, that `nth_element`

gives introselect `2 * log2(size)`

subdivision steps (twice as much as needed by quickselect in the best case) until `heap_select`

kicks in and solves the problem for good.

Source (Stackoverflow)