Paul Terwilliger Paul Terwilliger - 3 months ago 16
R Question

Masking Highest Set Bit in R?

I'm working in R. I need to find a fast function that will mask the highest set bit of an integer. For example:

# 6 binary is 110, this should turn into 010 which is 2
function_mask(6) = 2

# 8 in binary is 1000, this should turn into 0000
function_mask(8) = 0


This is equivalent to subtracting the closest lower power of two. I will be happy if I can find a fast function that will simply find the closest lower power of two. For example:

# 6 in binary is 110, the MSB is 100
function_power_two(6) = 4
function_mask(6) = 6 - function_power_two(6) = 2

# 8 in binary is 1000, the MSB is 1000 which is 8 in base 10
function_power_two(8) = 8
function_mask(8) = 8 - function_power_two(8) = 0


I have found bitwise operations in R: for example, bitwShiftL and bitwShiftR. However, I do not know how to implement a solution in R.

I have seen solutions in other languages: Java, C, and C++. However, I do not know how to implement these solutions in R.

There are solutions in C++ using Rcpp, however Rcpp does not support integers larger than 32-bit. I need larger integers than that.

dww dww
Answer

For an R solution:

function_mask <- function(x) {
  xb <-intToBits(x)
  highestbit <- length(xb) - match("01", rev(xb)) 
  x-2L^highestbit
}

Comparing speeds to the other answers, we see this one is fastest, so far.

function_mask1 <- function(x) {
  bits = intToBits(x)                 # Convert integer to binary vector
  ii = tail(which(bits > 0), n=1)     # Identify most significant bit
  bits[ii] = as.raw(0)                # Mask the most significant bit
  out = packBits(bits,type='integer') # Convert binary back to integer
  return(out)
}

maskHighBit <- function(x){
  strtoi(gsub("^1", "", R.utils::intToBin(as.integer(x))), base=2)}

library(microbenchmark)
microbenchmark(function_mask(112L), function_mask1(112L), maskHighBit(112L), times=1000)
#Unit: microseconds
#expr     min      lq      mean  median      uq      max neval cld
#function_mask(112L)  17.961  20.014  23.65080  23.092  24.632   57.475  1000 a  
#function_mask1(112L)  39.514  44.132  49.79744  47.724  49.777  107.765  1000  b 
#maskHighBit(112L) 108.791 114.435 127.53792 118.540 133.422 2054.189  1000   c