Debasish Mitra Debasish Mitra - 1 year ago 87
Python Question

Fastest way to get hamming distance for integer array

Let a and b be vectors of the same size with 8-bit integers (0-255). I want to compute the number of bits where those vectors differs i.e. a Hamming distance between vectors formed by concatenation of binary representations of those numbers. For example:

a = [127,255]
b= [127,240]

Using numpy library

# Output: array([ 0, 15])

What I need is now to binary represent each element of the above array and count number of 1s in all the elements of the array. The above example will give hamming distance of 0+4 = 4. Any fast and elegant solution for this in Python?

Answer Source

You could broadcast them into binary bits and get the count of different bits, like so -

def hamming_distance(a, b):
    r = (1 << np.arange(8))[:,None]
    a0 = ((a & r) > 0)
    b0 = ((b & r) > 0)
    return np.count_nonzero(a0!=b0)

Sample run -

In [144]: a = [127,255]
     ...: b = [127,240]

In [145]: hamming_distance(a, b)
Out[145]: 4