Aman Singhal Aman Singhal - 9 months ago 57
C++ Question

ith order statistic using C++'s STL

Given an empty array, I need to make two type of queries

  1. Inserting an element in the array

  2. Finding the index of some element k (obviously the array has to be kept sorted)

This can be done be using

set<int> st;

This will insert my element in

And for 2nd query

set<int>::iterator it;
it = st.find(k);
idx = distance(st.begin(), it);

This takes
time. (
[ +
] ).

Is there any way to do both queries in
using the predefined containers of C++?

Answer Source

No. It is not possible (with the predefined containers). The sequence containers of the C++ Standard Library have either:

  • O(1) random access and O(N) insertion/removal or
  • O(N) random access and O(1) insertion/removal

Note that deque is an exception, but only when the insertion/removal takes place at the ends of the array. The general case is still O(N).

Furthermore, the classification of iterators does not include a category for this case. You have the bidirectional iterators (those of list, set, multiset, map and multimap), which take O(N) time to jump to a random position, and the next category is for random access iterators (those of vector, deque and string). There is no intermediate category.

Adding a new category would not be trivial at all. The library also implements a lot of algorithms (like for_each) that work with containers. There is an implementation for every iterator category.

Order statistic trees have been proposed at Boost several times. As far as I know:

  1. 2004: First suggestion (I don't know if it was implemented)
  2. 2006: "Hierarchical Data Structures"
  3. 2006: AVL Array (renamed as "Rank List" in Boost)
  4. 2012: Counter tree

The main difficulty for them being accepted was the generalized opinion that they were not a benefit, but a hazard. Today's programmers are used to solve all the problems they know with the typical containers. Experienced programmers fear that newbies might blindly use the proposed container for everything, instead of choosing carefully.