Jim Slonder - 1 month ago
R Question

Summing nearby elements of a matrix in R

In R, I'm trying to write a simple function like the one below that sums the two elements in row i of a data frame that are k positions away from the (i, j) element. If the element is near an edge (e.g. j = 1 or j = n), I'd like the left or right neighbour that doesn't exist to be treated as 0. With my current function I get an error if the element to the right doesn't exist, or a vector if the one on the left doesn't exist, because of how R handles negative indices. Is there a nicer way to write this function without using if statements to deal with the three cases (element is in the middle, too far left, or too far right)?

sum_nearby <- function(dat, i, j, k) {
  dat[i, j - k] + dat[i, j + k]
}
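To make the failure modes concrete, here is a small sketch of what the three kinds of out-of-range column index do. The one-row data frame and its column names are made up for illustration; the behaviour shown is for a data frame, and a matrix differs slightly (for example, it returns plain vectors).

```r
# Toy one-row data frame; the column names are made up for illustration.
dat <- data.frame(a = 2, b = 4, c = 6)

# Negative index: drops column 1 instead of reaching "left of the edge",
# so two columns come back rather than a single value
dat[1, -1]

# Zero index: selects no columns at all
dat[1, 0]

# Index past the last column: an error for a data frame
res <- try(dat[1, 4], silent = TRUE)
inherits(res, "try-error")
```

So j - k < 0 silently returns the wrong shape, j - k == 0 returns something empty, and only j + k > n fails loudly.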


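Edit: I figured out a solution. Just pad the data frame with k columns of zeros on each side.

sum_nearby <- function(dat, i, j, k) {
  # Add k zero columns on each side; the loop variable must not be i,
  # otherwise the row argument i is overwritten
  for (p in seq_len(k)) dat <- cbind(0, dat, 0)
  # Original columns j - k and j + k are now at j and j + 2 * k
  dat[i, j] + dat[i, j + 2 * k]
}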
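The same padding idea can also be applied to just the selected row, which avoids copying the whole data frame k times. This is only a sketch: `sum_nearby_padded` is a hypothetical name, and it assumes every column of `dat` is numeric.

```r
# Hypothetical variant: pad the selected row once instead of growing dat in a loop.
sum_nearby_padded <- function(dat, i, j, k) {
  # Row i as a plain numeric vector (assumes all columns are numeric)
  row <- unlist(dat[i, ], use.names = FALSE)
  # k zeros on each side, so original column j now sits at position j + k
  padded <- c(rep(0, k), row, rep(0, k))
  # Original neighbours j - k and j + k are at positions j and j + 2 * k
  padded[j] + padded[j + 2 * k]
}

# Made-up example data: one row holding 2, 4, 6, 8, 10
dat <- data.frame(a = 2, b = 4, c = 6, d = 8, e = 10)
padded_sums <- sapply(1:5, function(j) sum_nearby_padded(dat, 1, j, 2))
padded_sums
# [1]  6  8 12  4  6
```

With k = 2, the first two and last two results contain only one real neighbour; the missing one contributes 0.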

Answer

You can clamp the two column indices to the valid range:

sum_nearby <- function(dat, i, j, k) {
  left <- max(1, j - k)
  right <- min(j + k, ncol(dat))
  dat[i, left] + dat[i, right]
  }

This means that close to the boundary, the k-neighbourhood is not symmetric: an out-of-range index is replaced by the first or last column instead of being treated as 0.

Let's consider a simplified case / example with a vector:

f <- function (x, j, k) {
  left <- max(1, j - k)
  right <- min(j + k, length(x))
  x[left] + x[right]
  }

Say

x <- seq(2, 10, by = 2)
# [1] 2 4 6 8 10

Let's test the summation effect for all elements with k = 2:

sapply(1:5, f, k = 2, x = x)
# [1]  8 10 12 14 16
  • The first value, 8, is actually x[1] + x[3], instead of x[-1] + x[3].
  • The second value, 10, is x[1] + x[4], rather than x[0] + x[4].
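The clamping can also be done for every position at once. The following is a minimal sketch using base R's `pmax` and `pmin`, which take element-wise maxima and minima; the variable names are only for illustration.

```r
x <- seq(2, 10, by = 2)
k <- 2
j <- seq_along(x)

# Clamp every left index to at least 1 and every right index to at most length(x)
left  <- pmax(1, j - k)
right <- pmin(j + k, length(x))

clamped <- x[left] + x[right]
clamped
# [1]  8 10 12 14 16
```

This reproduces the sapply result above without calling f once per element.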

If you simply want to ignore those "out-of-bounds" values, use an if for each side:

sum_nearby <- function(dat, i, j, k) {
  # A neighbour outside columns 1..ncol(dat) contributes 0
  left <- if (j - k >= 1) dat[i, j - k] else 0
  right <- if (j + k <= ncol(dat)) dat[i, j + k] else 0
  left + right
  }