ChuckP - 6 months ago 48

R Question

I'm trying to evaluate the following double sums in r:

I know that outer is quick way to do it. I've tried the following

`sum(f(outer(X,X,function(x,y) (x-y)/c)))`

and while it seems to work, I'm not sure just how fast it is compare to some alternatives? Does it make a difference in terms of speed to do outer first then my function or vice versa? is there a better way to do it?

Answer

I think `outer`

is actually faster enough for your application. Though in theory it is not the fastest given the symmetry of your computation, the fastest method can at most be 2 times faster anyway.

The following function gives this optimal implementation. We generate `(i, j)`

indices for the lower triangular part (excluding diagonal) of the combination matrix. `f`

will be only applied on this set. Due to symmetry, we multiple the sum over lower triangular by 2. Finally, we add the diagonal result which is just `n * f(0)`

.

```
## `x` is the vector, `f` is your function of interest, `c` is a constant
g <- function (x, f, c) {
n <- length(x)
j <- rep.int(1:(n-1), (n-1):1)
i <- sequence((n-1):1) + j
2 * sum(f((x[i] - x[j]) / c)) + n * f(0)
}
```

Maybe we can consider a small example here. Let's assume `c = 2`

and `f = sin`

. Suppose we have a vector `x <- 1:500`

. Let's do a benchmark:

```
x <- 1:500
library(microbenchmark)
microbenchmark(sin(outer(x,x, function (x,y) (x-y)/2)), g(x, sin, 2))
#Unit: milliseconds
# expr min lq mean median
# sin(outer(x, x, function(x, y) (x - y)/2)) 33.91868 37.60806 46.55688 38.71816
# g(x, sin, 2) 21.09788 22.81036 26.53589 23.53098
# uq max neval cld
# 40.12542 82.06012 100 b
# 25.00474 67.11804 100 a
```