ChuckP ChuckP - 28 days ago 18
R Question

Double sums in r using outer

I'm trying to evaluate the following double sums in r:

enter image description here

I know that outer is quick way to do it. I've tried the following

sum(f(outer(X,X,function(x,y) (x-y)/c)))


and while it seems to work, I'm not sure just how fast it is compare to some alternatives? Does it make a difference in terms of speed to do outer first then my function or vice versa? is there a better way to do it?

Answer

I think outer is actually faster enough for your application. Though in theory it is not the fastest given the symmetry of your computation, the fastest method can at most be 2 times faster anyway.

The following function gives this optimal implementation. We generate (i, j) indices for the lower triangular part (excluding diagonal) of the combination matrix. f will be only applied on this set. Due to symmetry, we multiple the sum over lower triangular by 2. Finally, we add the diagonal result which is just n * f(0).

## `x` is the vector, `f` is your function of interest, `c` is a constant
g <- function (x, f, c) {
  n <- length(x)
  j <- rep.int(1:(n-1), (n-1):1)
  i <- sequence((n-1):1) + j
  2 * sum(f((x[i] - x[j]) / c)) + n * f(0)
  }

Maybe we can consider a small example here. Let's assume c = 2 and f = sin. Suppose we have a vector x <- 1:500. Let's do a benchmark:

x <- 1:500
library(microbenchmark)
microbenchmark(sin(outer(x,x, function (x,y) (x-y)/2)), g(x, sin, 2))

#Unit: milliseconds
#                                       expr      min       lq     mean   median
# sin(outer(x, x, function(x, y) (x - y)/2)) 33.91868 37.60806 46.55688 38.71816
#                               g(x, sin, 2) 21.09788 22.81036 26.53589 23.53098
#       uq      max neval cld
# 40.12542 82.06012   100   b
# 25.00474 67.11804   100  a 
Comments