Michael WS Michael WS - 2 months ago 12
C++ Question

Merge N sorted in vectors c++

I have a vector of sorted c++ classes that have comparison operators. I need a vector of the objects sorted. Is there any open source algorithms that would take either a list of vectors or vector of vectors and make one sorted vector?

for n=2, this works fine:

std::vector<Data> dst;
std::merge(v1.begin(), v1.end(), v2.begin(), v2.end(), std::back_inserter(dst));

Answer

The standard library contains the merge function, allowing to merge two sorted vectors (in particular).

Say you have k sequences, with total length n. Of course, you could merge each time 2 sequences until only a single one remains, but this is known to be worst-case Θ(k2n).

There are two alternatives:

  • Merging By Greedy Size In this variant, each time, the two shortest sequences are merged. This [reduces the worst-case time to Θ(k log(k) n).

  • Merging With A Priority Queue In this scheme, a priority queue holds pairs of iterators and sequence indices, when a smallest element is popped, the next item from the sequence with this index is pushed. This has time Θ(log(k) n), as the priority queue has at most k items at each time.

The following contains an implementation of the latter (with the full code on ideone). (I suspect that, while asymptotically better than the former, it has much higher constants than the former. A generic implementation of the greedy algorithm is (at least for me) a bit trickier, though.)

include <vector>
#include <queue>
#include <iostream>


template<
    class InIts, 
    typename OutIt,
    class Cmp=std::less<typename std::iterator_traits<typename InIts::value_type::first_type>::value_type>>
OutIt merge(const InIts &in_its, OutIt out_it, Cmp cmp=Cmp())
{
    using it_t = typename InIts::value_type::first_type;
    using pair_t = std::pair<it_t, std::size_t>;
    auto pair_cmp = [cmp](const pair_t &lhs, const pair_t &rhs) { return !cmp(*lhs.first, *rhs.first); };
    using q_t = std::priority_queue<pair_t, std::vector<pair_t>, decltype(pair_cmp)>;

    std::vector<std::pair<it_t, it_t>> origs{in_its};
    q_t q{pair_cmp};

    for(std::size_t i = 0; i < origs.size(); ++i)
    {   
        auto &p = origs[i];
        if(p.first != p.second)
            q.push(std::make_pair(p.first++, i));
    }

    while(!q.empty()) 
    {
        auto t = q.top();
        *(out_it++) = *t.first;
        const auto i = t.second;
        q.pop();

        auto &p = origs[i];
        if(p.first != p.second)
            q.push(std::make_pair(p.first++, i));
    }

    return out_it;
}

You can use it as follows:

int main()
{
    using vec_t = std::vector<int>;

    vec_t v0{1, 2, 3}; 
    vec_t v1{2, 3, 4}; 
    vec_t v2{4, 5, 6}; 

    using vec_it_t = vec_t::iterator;

    std::vector<std::pair<vec_it_t, vec_it_t>> its{
        std::make_pair(std::begin(v0), std::end(v0)),
        std::make_pair(std::begin(v1), std::end(v1)),
        std::make_pair(std::begin(v2), std::end(v2))};

    vec_t res;
    merge(its, std::back_inserter(res));
    for(auto &e: res)
        std::cout << e << std::endl;
}