AwaitedOne AwaitedOne - 28 days ago 9
C++ Question

Generate hash for boost::dynamic_bitset and convert hash back to boost::dynamic_bitset

I am looking to generate a

boost::dynamic_bitset
hash so that I can store the value in
boost::bimaps
. I tried the following code,Test code here.

#include <iostream>
#include <boost/dynamic_bitset.hpp>
#include <unordered_map>
#include <boost/bimap.hpp>
#include <boost/bimap/unordered_set_of.hpp>
#include <boost/bimap/unordered_multiset_of.hpp>
#include <boost/bimap/set_of.hpp>
#include <boost/bimap/multiset_of.hpp>

#define BOOST_DYNAMIC_BITSET_DONT_USE_FRIENDS

namespace boost {
template <typename B, typename A>
std::size_t hash_value(const boost::dynamic_bitset<B, A>& bs) {
return boost::hash_value(bs.m_bits);
}
}

namespace bimaps = boost::bimaps;
typedef boost::bimap<bimaps::unordered_set_of<unsigned long long int>,
bimaps::unordered_multiset_of<size_t> > bimap_reference;
typedef bimap_reference::value_type position;
bimap_reference reference_index_vector;

int main()
{

std::string str = "1011010001101101000001101101000011111111011010000011011010000111111111110110100011011010000011011010000111111110110100000110110100001111111111";
boost::dynamic_bitset<> bits = boost::dynamic_bitset<> (str);
std::cout << "bitmap " << bits << std::endl;
std::cout << "Number of bits " << bits.count() << std::endl;

size_t hash1 = boost::hash_value (bits);
std::cout << "Hash value " << hash1 << std::endl;


/* Insert hash value in bimap
*
*/

// reference_index_vector.insert(position(10000000000, hash1));
// for( bimap_reference::const_iterator iter = reference_index_vector.begin(), iend = reference_index_vector.end();
// iter != iend; ++iter ) {
// std::cout << iter->left << " <--> "<< iter->right <<std::endl;
// }

return 0;
}


I get the error


In file included from /usr/include/boost/dynamic_bitset.hpp:15:0, from 3: In instantiation of 'std::size_t boost::hash_value(const boost::dynamic_bitset&) [with B = long unsigned int; A = std::allocator; std::size_t = long unsigned int]': 34:40: required from here /usr/include/boost/dynamic_bitset/dynamic_bitset.hpp:422:17: error: 'boost::dynamic_bitset<>::buffer_type boost::dynamic_bitset<>::m_bits' is private buffer_type m_bits; ^ 16:37: error: within this context


Not sure what is going wrong.


  1. How to hash
    boost::dynamic_bitset

  2. How to convert the hash back to orignial bitset.

  3. Total space needed (count of both 0 and 1 or only 1). Above code shows 80 bits only by
    bits.count()
    . I tried the following to generate the hash value, but not sure how much space is needed.



Also, I tried to generate the hash value of the bitset by the following code

/*Generating hash by bitset
*
*/
std::bitset<142> seq (str);
std::hash<std::bitset<142>> hash_bitset;
std::cout << "Bitset " << seq << std::endl;
std::cout << "Hash value " << hash_bitset(seq) << std::endl;

#Bitset 1011010001101101000001101101000011111111011010000011011010000111111111110110100011011010000011011010000111111110110100000110110100001111111111
#Hash value 4886653603414440856

Answer Source

Ok, I'm detecting much confusion about the essence of "hashing", so a few friendly pointers to get started:

Q. 2. How to convert the hash back to orignial bitset.

That's impossible. A hash is lossy digest. You can only do this if the hash is a Perfect Hash which, due to the laws of entropy, cannot happen if the bitset capacity exceeds the size of size_t on your platform (typically 32 or 64 bits).

Q. I also tried creating a hash by ...

 std::bitset<142> seq (str);
 ....

I hope you realize that std::bitset<> is an entirely different type, so it's not really related to the task. And, since it's not dynamic, it's rather unhelpful for the task, even as a workaround.

But Most Importantly:

Hashes are used by hash-tables (like unordered_*<>) but they are not stored. Hashes are lossy digests, only used to get a good distribution over the internal buckets¹. For actual element equality, std::equal<T> is still used.

In other words:

typedef boost::bimap<bimaps::unordered_set_of<unsigned long long int>,
        bimaps::unordered_multiset_of<size_t> > bimap_reference;

is unsuited for creating a map of anything other than size_t or unsigned long long². If you store hashes of things there:

reference_index_vector.insert(position(10000000000, hash1));

, you lose the original information. There's no way to get the bitset from hash1.

The compiler error

Your hash_value implementation mistakenly uses private members of dynamic_bitset<>. You can't because it's not accessible.

Here's a simple implementation of std::hash<> using the public interface:

Live On Coliru

#include <boost/dynamic_bitset.hpp>
#include <boost/functional/hash.hpp>
#include <unordered_map>
#include <sstream>

namespace std {

    template <typename Block, typename Alloc> struct hash<boost::dynamic_bitset<Block, Alloc> > {
        size_t operator()(boost::dynamic_bitset<Block, Alloc> const& bs) const {
            size_t seed = boost::hash_value(bs.size());

            std::vector<Block> blocks(bs.num_blocks());
            boost::hash_range(seed, blocks.begin(), blocks.end());

            return seed;
        }
    };

}

int main() {
    boost::dynamic_bitset<> x, y;
    x.resize(rand()%100, 1);
    y.resize(rand()%100, 0);

    std::unordered_map<boost::dynamic_bitset<>, std::string> m;
    m[x] = "x";
    m[y] = "y";
}

You can use this std::hash<> specialization instead and use boost::bimap with it.

NOTE, that using the public interface is not optimal, because it copies the Blocks (you also did that with the std::bitset<> hack). You might be interested in the Boost Serialization implementation I did for boost::dynamic_bitset<> before:


¹ Assuming, for simplicity, buckets instead of "open addressing" style. The same logic applies there, but somewhat more complicated

² (by the way, please just say uintmax_t or uint64_t)