JSH - 6 months ago 43

R Question

I am trying to extend the answer to the question "R: filtering data and calculating correlation".

To obtain the correlation between temperature and humidity for each month of the year (1 = January), we would have to repeat the following call for each month (12 times):

`cor(airquality[airquality$Month == 1, c("Temp", "Humidity")])`

Is there any way to do each month automatically?

In my case I have more than 30 groups (species rather than months) for which I would like to test for correlations. I just wanted to know if there is a faster way than doing it one by one.

Thank you!

Answer

```
cor(airquality[airquality$Month == 1, c("Temp", "Humidity")])
```

gives you a 2 × 2 matrix rather than a number. If you do want a matrix for each `Month`, then use

```
lst <- lapply(split(airquality[, c("Temp", "Humidity")], airquality$Month), cor)
```

so that you get a list in which each element stores a matrix.
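To make the shape of that result concrete, here is a minimal sketch of inspecting the list. It assumes `Wind` as a stand-in for `Humidity`, since the built-in `airquality` data has no `Humidity` column; the name `may` is only illustrative.

```r
# Wind stands in for Humidity (assumption: the built-in airquality data).
lst <- lapply(split(airquality[, c("Temp", "Wind")], airquality$Month), cor)

names(lst)             # one element per month present in the data
may <- lst[["5"]]      # 2 x 2 correlation matrix for May
may["Temp", "Wind"]    # off-diagonal entry: the Temp/Wind correlation
```

The list is named by the values of the splitting variable, so an individual group can be looked up by name.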

But if you want a single number for each `Month`, use

```
mapply(cor, with(airquality, split(Temp, Month)),
with(airquality, split(Humidity, Month)))
```

so that you get a vector.

**Reproducible example**

The `airquality` dataset in R does not have a `Humidity` column, so I will use `Wind` for testing:

```
x <- mapply(cor, with(airquality, split(Temp, Month)),
with(airquality, split(Wind, Month)))
#         5          6          7          8          9
#-0.3732760 -0.1210353 -0.3052355 -0.5076146 -0.5704701
```

We get a named vector, where `names(x)` gives the `Month` and `unname(x)` gives the correlation.
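The same split-and-apply pattern carries over to any grouping variable, such as the species mentioned in the question. The following is a sketch on the built-in `iris` data, not the asker's data: the column pair `Sepal.Length` / `Petal.Length` and the names `by_species` and `res` are chosen only for illustration. Since the question talks about testing, it uses `cor.test` to get a p-value alongside each estimate.

```r
# Split the data frame by group, then run cor.test() within each piece.
by_species <- split(iris, iris$Species)

res <- do.call(rbind, lapply(names(by_species), function(g) {
  d  <- by_species[[g]]
  ct <- cor.test(d$Sepal.Length, d$Petal.Length)  # Pearson by default
  data.frame(group = g, r = unname(ct$estimate), p.value = ct$p.value)
}))
res
```

Each row of `res` holds one group's correlation estimate and its p-value, so the number of groups does not change the code.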

Source (Stackoverflow)