Jonathan Mee - 5 days ago 6
C++ Question

How is nth_element Implemented?

There are a lot of claims on StackOverflow and elsewhere that

`nth_element`
is O(n) and that it is typically implemented with Introselect: http://en.cppreference.com/w/cpp/algorithm/nth_element

I want to know how this can be achieved. I looked at Wikipedia's explanation of Introselect and that just left me more confused. How can an algorithm switch between QSort and Median-of-Medians?

In this paper, we concentrate on the sorting problem and return to the selection problem only briefly in a later section.

I've tried to read through the STL itself to understand how
`nth_element`
is implemented, but that gets hairy real fast.

Could someone show me pseudo-code for how Introselect is implemented? Or even better, actual C++ code other than the STL of course :)

You asked two questions, the titular one

How is nth_element Implemented?

There are a lot of claims on StackOverflow and elsewhere that nth_element is O(n) and that it is typically implemented with Introselect.

Which I also can confirm from looking at my stdlib implementation. (More on this later.)

And the one where you don't understand the answer:

How can an algorithm switch between QSort and Median-of-Medians?

Lets have a look at pseudo code that I extracted from my stdlib:

``````nth_element(first, nth, last)
{
if (first == last || nth == last)
return;

introselect(first, nth, last, log2(last - first) * 2);
}

introselect(first, nth, last, depth_limit)
{
while (last - first > 3)
{
if (depth_limit == 0)
{
// [NOTE by editor] This should be median-of-medians instead.
// [NOTE by editor] See Azmisov's comment below
heap_select(first, nth + 1, last);
// Place the nth largest element in its final position.
iter_swap(first, nth);
return;
}
--depth_limit;
cut = unguarded_partition_pivot(first, last);
if (cut <= nth)
first = cut;
else
last = cut;
}
insertion_sort(first, last);
}
``````

Without getting into details about the referenced functions `heap_select` and `unguarded_partition_pivot` we can clearly see, that `nth_element` gives introselect `2 * log2(size)` subdivision steps (twice as much as needed by quickselect in the best case) until `heap_select` kicks in and solves the problem for good.

Source (Stackoverflow)