Lfppfs - 6 months ago 19

R Question

I have a df with 3 columns:

- column_1: numeric
- column_2: numeric
- column_3: factor variable with two groups, A and B

I want to compute a Spearman's correlation test between columns 1 and 2, but only between groups (so the correlation is computed only between observations of columns 1 and 2 which match group A, the same applying to group B).

So I'm using these lines of code:

`cor.test(df$column_1, df$column_2, alternative = ("two.sided"), subset(df, column_3==c("group_A")), data = df, method = c("spearm"))`

`cor.test(df$column_1, df$column_2, alternative = ("two.sided"), subset(df, column_3==c("group_B")), data = df, method = c("spearm"))`

Thing is, I get the same result in both tests, so I guess the subset function is not working, because if I previously subset the groups, like this:

`x <- subset(df, column_3==c("group_A"))`

`y <- subset(df, column_3==c("group_B"))`

and then run `cor.test` on `x` and `y` separately, the results differ.

PS: I get the following warning, but I don't think it has to do with the issue I'm asking about:

`Warning message: In cor.test.default(cor_itir$Nart, cor_itir$Medida, alternative = "two.sided", : cannot compute exact p-value with ties`

Answer

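You're overcomplicating things a bit by using `df$...` extractors, specifying `data=`, and calling `subset()` as a standalone function inside the call. Because `df$column_1` and `df$column_2` are the full, unfiltered columns, both of your calls test the same data, which is why the results match. I believe you can get the per-group results using something like: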

```
# here's some example data with different correlations between each group
df <- data.frame(column_1=1:10,column_2=c(1:5,6,4,3,11,9),column_3=rep(c("a","b"),each=5))
```

Then just specify your formula, your `data=` and your `subset=` inline:

```
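# method="spearman" added because the question asks for Spearman; the default is Pearson
cor.test(~ column_1 + column_2, alternative="two.sided", method="spearman", data=df, subset=(column_3=="a"))
cor.test(~ column_1 + column_2, alternative="two.sided", method="spearman", data=df, subset=(column_3=="b"))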
```
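As a sanity check, here is a minimal sketch comparing the inline `subset=` form with the "subset first" approach from the question, using the example data above. The names `inline_b`, `df_b`, `manual_b` and `pooled` are just illustrative, and `method = "spearman"` is used because that is what the question asks for. The last call passes the full columns, so it should use every row regardless of group; `exact = FALSE` is set there only because the pooled data contain ties.

```r
# Example data from above, repeated so this block runs on its own
df <- data.frame(column_1 = 1:10,
                 column_2 = c(1:5, 6, 4, 3, 11, 9),
                 column_3 = rep(c("a", "b"), each = 5))

# Inline subset via the formula interface
inline_b <- cor.test(~ column_1 + column_2, data = df,
                     subset = (column_3 == "b"), method = "spearman")

# Subset first, then pass plain vectors (the approach from the question)
df_b <- subset(df, column_3 == "b")
manual_b <- cor.test(df_b$column_1, df_b$column_2, method = "spearman")

# Full columns: all 10 rows are used, whatever the group
pooled <- cor.test(df$column_1, df$column_2,
                   method = "spearman", exact = FALSE)

# The two per-group approaches should agree; the pooled estimate should not
same_as_manual <- isTRUE(all.equal(inline_b$estimate, manual_b$estimate))
same_as_pooled <- isTRUE(all.equal(inline_b$estimate, pooled$estimate))
print(c(same_as_manual = same_as_manual, same_as_pooled = same_as_pooled))
```

If the two per-group estimates agree and the pooled one differs, that supports the explanation that the `df$...` vectors were never subset in the original calls.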

Or all in one go using `by()`:

```
by(df, df$column_3, FUN = function(x) cor.test(~ column_1 + column_2, data = x, method = "spearman"))
```
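`by()` prints one full test report per group. If you only want the numbers, one possible sketch is to `split()` the data frame and pull the estimate and p-value out of each test. The name `res` and the choice of which fields to keep are my own assumptions, not something from the question:

```r
# Example data from above, repeated so this block runs on its own
df <- data.frame(column_1 = 1:10,
                 column_2 = c(1:5, 6, 4, 3, 11, 9),
                 column_3 = rep(c("a", "b"), each = 5))

# One Spearman test per level of column_3; keep only rho and the p-value
res <- sapply(split(df, df$column_3), function(g) {
  ct <- cor.test(~ column_1 + column_2, data = g, method = "spearman")
  c(rho = unname(ct$estimate), p.value = ct$p.value)
})

# Matrix with one column per group and rows "rho" and "p.value"
print(res)
```

With real data that contain ties, each call may still emit the "cannot compute exact p-value with ties" warning; passing `exact = FALSE` to `cor.test()` requests the approximate p-value directly.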