Phdaml - 1 year ago 63
R Question

# add a vector to all rows of a matrix

I am maximizing a likelihood function and trying to reduce the loop.
I want to add the vector(parameters to be estimated) to all rows of a matrix (data). The length of vector equals to the column of matrix.

`a+b`
would give the wrong results because the recycle rule of R is by column not row.

``````a<-c(1,2,0,0,0)  # parameters to be optimized
b<-matrix(1,ncol=5,nrow=6) # data
t(a+t(b)) # my code would work, anything more intuitive?
``````

Desired output

``````        [,1] [,2] [,3] [,4] [,5]
[1,]    2    3    1    1    1
[2,]    2    3    1    1    1
[3,]    2    3    1    1    1
[4,]    2    3    1    1    1
[5,]    2    3    1    1    1
[6,]    2    3    1    1    1
``````

The wrong output

``````a+b
[,1] [,2] [,3] [,4] [,5]
[1,]    2    3    1    1    1
[2,]    3    1    1    1    2
[3,]    1    1    1    2    3
[4,]    1    1    2    3    1
[5,]    1    2    3    1    1
[6,]    2    3    1    1    1
``````

We can use `col` to replicate the 'a' elements

``````b + a[col(b)]
#     [,1] [,2] [,3] [,4] [,5]
#[1,]    2    3    1    1    1
#[2,]    2    3    1    1    1
#[3,]    2    3    1    1    1
#[4,]    2    3    1    1    1
#[5,]    2    3    1    1    1
#[6,]    2    3    1    1    1
``````

Or a faster option would be to use `rep`

``````b + rep(a, each = nrow(b))
``````

Or use `sweep`

``````sweep(b, 2, a, "+")
``````

### Benchmarks

``````set.seed(24)
b <- matrix(sample(0:9, 8000*5000, replace=TRUE), ncol=5000)
a <- sample(0:3, 5000, replace=TRUE)
system.time(b + a[col(b)])
#  user  system elapsed
#  1.08    0.06    1.14
system.time(b + rep(a, each = nrow(b)))
#   user  system elapsed
#   0.83    0.03    0.86

system.time(t(a+t(b)))
#   user  system elapsed
#   1.14    0.03    1.17

system.time(sweep(b, 2, a, "+"))
#  user  system elapsed
#  0.62    0.06    0.69
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download