magsd - 8 months ago 77

R Question

I want to apply wilcox.test to each pair of corresponding rows in two data frames in R: row 1 of df1 against row 1 of df2, and so on, to see whether they differ significantly. Each data frame has hundreds of rows and 105 columns, so I expect hundreds of p-values as the result. I am not sure how to write a command that runs the test for every row pair. Any help is appreciated!

Answer

Using the following data as an example:

```
# Two numeric data frames (all columns are numeric),
# each with 5 rows and 100 columns
set.seed(5)
df1 <- as.data.frame(matrix(runif(500), nrow=5, ncol=100))
df2 <- as.data.frame(matrix(runif(500), nrow=5, ncol=100))
```

Solution

```
# A single lapply is enough to run the Wilcoxon test for each row pair
lapply(1:nrow(df1), function(i) {
  # Compare row i of df1 with row i of df2 and return only the p-value
  wilcox.test(as.numeric(df1[i, ]), as.numeric(df2[i, ]))$p.value
})
```

Output:

```
> lapply(1:nrow(df1), function(i) {
+ wilcox.test(as.numeric(df1[i, ]), as.numeric(df2[i, ]))$p.value
+ })
[[1]]
[1] 0.8690001
[[2]]
[1] 0.1390142
[[3]]
[1] 0.7479788
[[4]]
[1] 0.5340455
[[5]]
[1] 0.8459806
```
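With hundreds of rows, a plain numeric vector is often easier to work with than the list that lapply returns. The following is a minimal sketch of one way to get that, assuming the same example data as above and that every column is numeric; the names m1, m2 and p_values are illustrative and not part of the original answer:

```r
# Same example data as in the answer above
set.seed(5)
df1 <- as.data.frame(matrix(runif(500), nrow = 5, ncol = 100))
df2 <- as.data.frame(matrix(runif(500), nrow = 5, ncol = 100))

# Convert once to matrices so that each row is already a numeric vector
# (assumption: all columns are numeric and both data frames have the same number of rows)
m1 <- as.matrix(df1)
m2 <- as.matrix(df2)

# vapply returns a numeric vector with one p-value per row pair
p_values <- vapply(seq_len(nrow(m1)), function(i) {
  wilcox.test(m1[i, ], m2[i, ])$p.value
}, numeric(1))

p_values
```

Here seq_len(nrow(m1)) is used instead of 1:nrow(m1) so that the loop does nothing when the input has zero rows, and the numeric(1) template makes vapply fail loudly if any call returns something other than a single number.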

Source (Stack Overflow)