Behuvius Behuvius - 1 month ago 7
C++ Question

Comparing 2 std::lists of std::pairs - Keeping Unique and Different elements

I have 2 Lists of Pairs, each having matching pairs, non matching pairs, and unique pairs. I want to keep the unique and non matching pairs.

I made this code, which works great for finding the mismatches, but not the unique:

std::list <std::pair<std::string,std::string>> outputList1;
std::list <std::pair<std::string,std::string>> outputList2;
for (auto x: outputList1){
for (auto y: outputList2){
//compare x to y
if(x.first == y.first)
{
if(x.second != y.second)
{
//mismatch on the second element
//do something with it
}


}

}
}


This is what I tried to pull in the unique. I tried to check to see if I am at the last element of y, and if so, save that pair to pull out the unique elements, but it is pulling in everything. I tried writing this out on paper, but my head is spinning.

auto &last = *(--outputList2.end());
for (auto x: outputList1){
for (auto& y: outputList2){
//compare x to y
if(x.first == y.first)
{
if(x.second != y.second)
{
//mistmatch found so I save it off somewhere
}


}
else if(&y == &last)
{
//unique item found in list x
}

}
}


I also tried this to get the end of y on the else if:

else if(&y == &*std::prev(std::end(outputList2)))


Can you see what I'm doing wrong?

Answer

I think you search for std::set_symmetric_difference. In other words, you want to do something with all elements except those that exist in both lists (the matching ones). In other words, you want to discard the intersection. That is what std::set_symmetric_difference is build of.

The thing with std::set_difference and std::set_symmetric_difference is the ranges must be sorted. So, we sort the lists first, but in another lists to keep the original lists untouched and using reference wrappers to avoid so many copies of the inner pairs, sorting the references to those pairs. Since std::reference_wrapper has no defined operator<, we need to pass a comparator (a lambda function):

#include <iostream>
#include <algorithm>
#include <iterator>
#include <list>
#include <functional>

using pair_t = std::pair<std::string, std::string>;
using pair_cref = std::reference_wrapper<const pair_t>;

int main()
{
    std::list<pair_t> outputList1{{"c", "b"}, {"a", "g"}, {"0", "f"}};
    std::list<pair_t> outputList2{{"c", "d"}, {"0", "f"}, {"z", "1"}};

    std::list<pair_cref> sorted_view_1(outputList1.begin(), outputList1.end());
    std::list<pair_cref> sorted_view_2(outputList2.begin(), outputList2.end());

    auto less_f = [](const pair_cref& a, const pair_cref& b)
         { return a.get() < b.get(); };

    sorted_view_1.sort(less_f);
    sorted_view_2.sort(less_f);

    std::list<pair_cref> unmatchs_and_uniques;
    std::set_symmetric_difference(sorted_view_1.begin(), sorted_view_1.end(),
                                  sorted_view_2.begin(), sorted_view_2.end(),
                                  std::back_inserter(unmatchs_and_uniques), less_f);

    for (const auto& p : unmatchs_and_uniques)
        std::cout << p.get().first << ", " << p.get().second << std::endl;
}

Output:

a, g
c, b
c, d
z, 1

Coliru demo.

If you don't mind to modify the original lists, you can just do:

outputList1.sort();
outputList2.sort();

instead of creating the sorted views, and apply std::set_symmetric_difference directly over the outputListx.

The code is a bit more readable, though not shorter. In any case, that solution is faster, because yours has a

yours

complexity order, while using std::list::sort and std::set_symmetric_difference, the complexity order of the algorithm is the one of sort() (std::set_symmetric_difference has linear order):

mine

being nx the size of the longest list.