Andrei Pavel Andrei Pavel - 4 months ago 24
C++ Question

Same key, multiple entries for std::unordered_map?

I have a map inserting multiple values with the same key of

C string
type.

I would expect to have a single entry with the specified key.

However the map seems to take it's address into consideration when uniquely identifying a key.

#include <cassert>
#include <iostream>
#include <string>
#include <unordered_map>

typedef char const* const MyKey;

/// @brief Hash function for StatementMap keys
///
/// Delegates to std::hash<std::string>.
struct MyMapHash {
public:
size_t operator()(MyKey& key) const {
return std::hash<std::string>{}(std::string(key));
}
};

typedef std::unordered_map<MyKey, int, MyMapHash> MyMap;

int main()
{
// Build std::strings to prevent optimizations on the addresses of
// underlying C strings.
std::string key1_s = "same";
std::string key2_s = "same";
MyKey key1 = key1_s.c_str();
MyKey key2 = key2_s.c_str();

// Make sure addresses are different.
assert(key1 != key2);

// Make sure hashes are identical.
assert(MyMapHash{}(key1) == MyMapHash{}(key2));

// Insert two values with the same key.
MyMap map;
map.insert({key1, 1});
map.insert({key2, 2});

// Make sure we find them in the map.
auto it1 = map.find(key1);
auto it2 = map.find(key2);
assert(it1 != map.end());
assert(it2 != map.end());

// Get values.
int value1 = it1->second;
int value2 = it2->second;

// The first one of any of these asserts fails. Why is there not only one
// entry in the map?
assert(value1 == value2);
assert(map.size() == 1u);
}


A print in the debugger shows that map contains two elements just after inserting them.

(gdb) p map
$4 = std::unordered_map with 2 elements = {
[0x7fffffffda20 "same"] = 2,
[0x7fffffffda00 "same"] = 1
}


Why does this happen if the hash function which delegates to
std::hash<std::string>
only takes it's value into account (this is asserted in the code)?

Moreover, if this is the intended behaviour, how can I use a map with C string as key, but with a 1:1 key-value mapping?

Answer Source

The reason is that hash maps (like std::unordered_map) do not only rely on the hash function for determining if two keys are equal. The hash function is the first comparison layer, after that the elements are always also compared by value. The reason is that even with good hash functions you might have collisions where two different keys yield the same hash value - but you still need to be able to save both entries in the hashmap. There are various strategies to handle that, you can find more information on looking for collision resolution for hash maps.

In your examples both entries have the same hash value but different values. The values are just compared by the standard comparison function, which compares the char* pointers, which are different. Therefore the value comparison fails and you get two entries in the map. To solve your issue you also need to define a custom equality function for your hash map, which can be done by specifiying the fourth template parameter KeyEqual for std::unordered_map.