DrPineapple - 9 months ago 39

R Question

Say that I have N dataframes with identical dimensions (same number of rows and columns):

```
set.seed(2)
df1 <- data.frame(replicate(100,rnorm(100)))
df2 <- data.frame(replicate(100,rnorm(100)))
dfN <- data.frame(replicate(100,rnorm(100)))
```

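And I want to apply a function (in this case `t.test()`) to the values at the same position across all dataframes. For example, take the first cell of each:

```
one <- df1[1,1]
two <- df2[1,1]
Nth <- dfN[1,1]
```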

Perform a `t.test()` on them:

```
first.cell.each <- cbind.data.frame(one,two,Nth)
t.test(first.cell.each, mu=0)
```

And repeat that across all cells (in this case 10000).

edit: clarified

Answer Source

We can create a `matrix` to store the `p.value` output of `t.test`, with the same dimensions as the individual datasets. Then loop through the sequence of rows and columns, extract the elements from each of the datasets, concatenate them, run the `t.test`, and assign the output to the same row/column index of `res`.

```
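# Pre-allocate a matrix of p-values with the same dimensions as the data frames
res <- matrix(NA_real_, nrow = nrow(df1), ncol = ncol(df1))
for (i in seq_len(nrow(df1))) {
  for (j in seq_len(ncol(df1))) {
    # Pool cell (i, j) from every data frame and run a one-sample t-test
    res[i, j] <- t.test(c(df1[i, j], df2[i, j], dfN[i, j]), mu = 0)$p.value
  }
}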
```

The result is a 100 x 100 matrix of p-values:

```
str(res)
#num [1:100, 1:100] 0.629 0.5 0.131 0.769 0.348 ...
```
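As a sanity check, the loop result for the first cell can be compared against the manual test described in the question. This is only a sketch: the 5 x 5 size is an assumption chosen to keep it quick, and `first_cell` and `manual_p` are illustrative names that do not appear in the original post.

```r
set.seed(2)
# Small 5 x 5 stand-ins for df1, df2, dfN (size is an assumption, not the 100 x 100 above)
df1 <- data.frame(replicate(5, rnorm(5)))
df2 <- data.frame(replicate(5, rnorm(5)))
dfN <- data.frame(replicate(5, rnorm(5)))

# Same nested loop as above, on the smaller data
res <- matrix(NA_real_, nrow = nrow(df1), ncol = ncol(df1))
for (i in seq_len(nrow(df1))) {
  for (j in seq_len(ncol(df1))) {
    res[i, j] <- t.test(c(df1[i, j], df2[i, j], dfN[i, j]), mu = 0)$p.value
  }
}

# Manual one-sample t-test on the first cell of each data frame
first_cell <- c(df1[1, 1], df2[1, 1], dfN[1, 1])
manual_p <- t.test(first_cell, mu = 0)$p.value

# The loop and the manual test should agree for cell [1, 1]
all.equal(res[1, 1], manual_p)
```

With only three values per cell, each test has just 2 degrees of freedom, so the individual p-values are worth reading with some caution.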

If there are many datasets, we can place them in a `list`, then convert it to an `array` and do the `t.test` using `apply`:

```
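# Collect the data frames by name into a list
lst <- mget(paste0("df", c(1, 2, "N")))
# Stack them into a rows x columns x datasets array
ar1 <- array(unlist(lst), dim = c(dim(df1), length(lst)))
# Move the dataset dimension first, then run the t-test for each (row, column) position
res2 <- apply(aperm(ar1, c(3, 1, 2)), c(2,3), FUN = function(x) t.test(x, mu = 0)$p.value)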
str(res2)
# num [1:100, 1:100] 0.629 0.5 0.131 0.769 0.348 ...
```