user3067923 - 7 months ago
R Question

Making a function that checks if a vector exists in a matrix faster

I have the following function (funtest) to test whether a specific vector exists as a row of a matrix. The vector will always have length 2 and the matrix will always have two columns. The function works fine; I would just like to make it faster (ideally much faster), because my matrices can have hundreds to thousands of rows.

x = c(1,-2)

m <- matrix(sample(c(1,-2,3,4), 500*2, replace=TRUE), ncol=2)

funtest(m, x)
[1] TRUE

This is how fast it currently is:

microbenchmark(funtest(m, x), times=100)
Unit: milliseconds
          expr      min       lq     mean   median       uq      max
 funtest(m, x) 1.501247 1.536157 1.674668 1.567826 1.708293 2.900046

This is the function:

funtest = function(m, x) {
  # TRUE if any row of m equals x element by element
  out = any(apply(m,1,function(n,x) all(n==x),x=x))
  return(out)
}


How about:

paste(x[1], x[2], sep='&') %in% paste(m[,1], m[,2], sep='&')

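This should be quite efficient. It is based on matching: %in% is a thin wrapper around match(), which only looks for the first match and is vectorized, so it avoids the row-by-row function calls that apply makes. Note that paste still has to build one string per row of m before the matching starts.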
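As a quick sanity check, the one-liner above can be wrapped in a function and compared against funtest on a small matrix where the answer is known in advance. The name funtest_paste and the matrix m_small are made up for illustration. This is only a sketch, and it assumes numeric inputs whose printed form never contains the separator.

```r
# Row-by-row version from the question.
funtest <- function(m, x) {
  any(apply(m, 1, function(n, x) all(n == x), x = x))
}

# Hypothetical wrapper around the paste/%in% one-liner.
funtest_paste <- function(m, x) {
  paste(x[1], x[2], sep = "&") %in% paste(m[, 1], m[, 2], sep = "&")
}

# Small hand-made matrix (illustrative data, not from the question).
m_small <- matrix(c(1, -2,
                    3,  4,
                    1,  2), ncol = 2, byrow = TRUE)

# c(1, 2) is the third row; c(4, 3) is not a row (the order differs).
res_present <- c(funtest(m_small, c(1, 2)), funtest_paste(m_small, c(1, 2)))
res_absent  <- c(funtest(m_small, c(4, 3)), funtest_paste(m_small, c(4, 3)))
print(res_present)
print(res_absent)
```

Both functions should agree on every input of this kind; if they ever disagree, the separator or the number formatting done by paste is the first thing to check.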

However, I am sure this is not the fastest option. The optimal solution would be to write this operation in C with a single while loop that stops at the first matching row. Even so, I would expect the speedup over the approach above to be no more than a factor of 2.
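The control flow of such a single loop can be sketched in plain R. This is not the C implementation the answer has in mind, and an interpreted R loop will not necessarily be fast; it only shows the early exit. The name funtest_loop and the demo matrix are hypothetical.

```r
# Sketch of the single-loop, early-exit logic (illustrative only).
funtest_loop <- function(m, x) {
  i <- 1L
  n <- nrow(m)
  while (i <= n) {
    # Compare both columns of row i; stop at the first full match.
    if (m[i, 1] == x[1] && m[i, 2] == x[2]) {
      return(TRUE)
    }
    i <- i + 1L
  }
  FALSE  # no row matched
}

# Illustrative data, not from the question.
m_demo <- matrix(c(1, -2,
                   3,  4,
                   1,  2), ncol = 2, byrow = TRUE)

hit  <- funtest_loop(m_demo, c(3, 4))  # second row matches
miss <- funtest_loop(m_demo, c(4, 3))  # same values, wrong order
print(c(hit, miss))
```

A C version would follow the same structure, walking the two columns with one index and returning as soon as both values match.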