domaeg - 3 months ago 14

R Question

I have a vector that provides how many "1" each row of a matrix has. Now I have to create this matrix out of the vector.

For example, let's say I want to create a 4 x 9 matrix `out`:

    v <- c(2,6,3,9)
    out
         [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
    [1,]    1    1    0    0    0    0    0    0    0
    [2,]    1    1    1    1    1    1    0    0    0
    [3,]    1    1    1    0    0    0    0    0    0
    [4,]    1    1    1    1    1    1    1    1    1

I've done this with a `for` loop:

    out <- NULL
    for(i in 1:length(v)){
      out <- rbind(out,c(rep(1, v[i]),rep(0,9-v[i])))
    }

Does anyone have an idea for a faster way to create such a matrix?

Answer

Here is my approach using `sapply` and `do.call`, along with some timings on a small sample.

```
library(microbenchmark)
library(Matrix)

v <- c(2,6,3,9)

microbenchmark(
  roman = {
    # build each row as a vector, then bind the rows together
    xy <- sapply(v, FUN = function(x, ncols) {
      c(rep(1, x), rep(0, ncols - x))
    }, ncols = 9, simplify = FALSE)
    xy <- do.call("rbind", xy)
  },
  fourtytwo = {
    # fill the first y entries of a zero vector; vapply returns columns, hence t()
    t(vapply(v, function(y) { x <- numeric(length = 9); x[1:y] <- 1; x }, numeric(9)))
  },
  akrun = {
    # sparse matrix from (row, column) index pairs, then converted to dense
    m1 <- sparseMatrix(i = rep(seq_along(v), v), j = sequence(v), x = 1)
    m1 <- as.matrix(m1)
  })

# Unit: microseconds
#       expr      min        lq       mean    median       uq
#      roman   26.436   30.0755   36.42011   36.2055   37.930
#  fourtytwo   43.676   47.1250   55.53421   54.7870   57.852
#      akrun 1261.634 1279.8330 1501.81596 1291.5180 1318.720
```

And for a somewhat larger sample:

```
v <- sample(2:9, size = 10e3, replace = TRUE)

# Unit: milliseconds
#       expr      min       lq     mean   median       uq
#      roman 33.52430 35.80026 37.28917 36.46881 37.69137
#  fourtytwo 37.39502 40.10257 41.93843 40.52229 41.52205
#      akrun 10.00342 10.34306 10.66846 10.52773 10.72638
```
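To make the `akrun` entry easier to follow, here is a minimal base R sketch of the row and column indices it passes to `sparseMatrix`. The variable `ncols` and the dense fill via `out[cbind(i, j)]` are my own illustration, not part of the benchmarked code.

```r
# Sketch only: shows the (i, j) index pairs used in the akrun entry.
v <- c(2, 6, 3, 9)
ncols <- 9  # assumed number of columns, taken from the question

i <- rep(seq_along(v), v)  # row index of every 1: 1 1 2 2 2 2 2 2 3 3 3 4 ...
j <- sequence(v)           # column index of every 1: 1 2 1 2 3 4 5 6 1 2 3 1 ...

# Dense base R equivalent: start from zeros and set the listed
# positions to 1 using two-column matrix indexing.
out <- matrix(0, nrow = length(v), ncol = ncols)
out[cbind(i, j)] <- 1
out
```

One thing to keep in mind: as far as I know, `sparseMatrix` infers the dimensions from the largest indices when `dims` is not given, so if no element of `v` equals 9 the result would have fewer than 9 columns unless you pass something like `dims = c(length(v), 9)`.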

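As the object grows, the benefits of `sparseMatrix` become apparent: on 10,000 rows it is roughly three to four times faster than the other two approaches, whereas on the four-row sample it was by far the slowest.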
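For completeness, a fully vectorized base R variant is also possible with `outer`. This is only a sketch: it was not part of the timings above, so I make no claim about where it would rank, and `ncols` and `out2` are names I made up for the illustration.

```r
# Sketch only: not included in the benchmarks above.
v <- c(2, 6, 3, 9)
ncols <- 9  # assumed number of columns, taken from the question

# outer() compares every element of v with every column index;
# entry [r, k] is TRUE when k <= v[r]. The unary plus turns the
# logical matrix into an integer 0/1 matrix.
out2 <- +outer(v, seq_len(ncols), ">=")
out2
```

Note that the result is an integer matrix, whereas the approaches above return doubles; wrap it in `out2 * 1` if that difference matters to you.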