JMM - 6 months ago 46

R Question

I have two data sets that contain columns with the same names but different values in those columns, e.g.:

```
m1 <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE,
             dimnames = list(c("s1", "s2", "s3"), c("cow", "dog", "cat")))

m2 <- matrix(1:9, nrow = 3, ncol = 3, byrow = FALSE,
             dimnames = list(c("s1", "s2", "s3"), c("dog", "cow", "cat")))

> m1
   cow dog cat
s1   1   2   3
s2   4   5   6
s3   7   8   9

> m2
   dog cow cat
s1   1   4   7
s2   2   5   8
s3   3   6   9
```

I would like to create a function using `cor.test()` to calculate the correlation between corresponding columns, e.g. cow vs. cow and dog vs. dog. I want to use `cor.test()` because I need both the correlation coefficient and the p-value, so if there are other ways to obtain this information, I'm open to those too. The actual data set has thousands of columns in no particular order, so I'm looking for a way to match the columns first and then calculate the correlation. Any ideas?

Answer

Here is a solution using `lapply` on the common columns:

```
# Common columns
cols <- intersect(colnames(m1), colnames(m2))
# For each column, compute cor test
res <- lapply(cols, function(x) cor.test(
  m1[, x],
  m2[, x]
))
names(res) <- cols
```

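The result is a list of `htest` objects that you can access by column name, e.g. `res[["cow"]]`. The correlation coefficient is in `res[["cow"]]$estimate` and the p-value in `res[["cow"]]$p.value`.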
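With thousands of columns, it may be handier to collect the coefficient and p-value of every test into one table instead of inspecting each `htest` object. The following is a minimal sketch, not part of the original answer. It uses made-up random data (10 samples, so the tests are not degenerate), assumes the rows of both matrices are in the same sample order, and the name `summary_df` is only illustrative:

```r
# Hypothetical example data: same column names, different column order
set.seed(1)
samples <- paste0("s", 1:10)
m1 <- matrix(rnorm(30), nrow = 10,
             dimnames = list(samples, c("cow", "dog", "cat")))
m2 <- matrix(rnorm(30), nrow = 10,
             dimnames = list(samples, c("dog", "cow", "cat")))

# Match columns by name, then run one test per common column
cols <- intersect(colnames(m1), colnames(m2))
res <- lapply(cols, function(x) cor.test(m1[, x], m2[, x]))
names(res) <- cols

# Pull the coefficient and p-value out of each htest object
summary_df <- data.frame(
  column   = cols,
  estimate = vapply(res, function(h) unname(h$estimate), numeric(1),
                    USE.NAMES = FALSE),
  p.value  = vapply(res, function(h) h$p.value, numeric(1),
                    USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)

print(summary_df)
```

Each row of `summary_df` corresponds to one matched column, so the table can be sorted or filtered by `p.value` like any other data frame.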