hitchhiker - 6 months ago 24

R Question

I'm new to R. I've looked through many similar questions but not found anything that has helped me solve my problem.

Say I have a data frame dat created like so:

`dat <- data.frame(v1=rep(c("a","a","b","b"),3), v2=c(rep("x",4),rep("y",4),rep("z",4)), dv=sample(1:100, 12), id=rep(c("p1","p2"),6))`

...that looks like this:

       v1 v2 dv id
    1   a  x 40 p1
    2   a  x 99 p2
    3   b  x 67 p1
    4   b  x 24 p2
    5   a  y 16 p1
    6   a  y 51 p2
    7   b  y 85 p1
    8   b  y 72 p2
    9   a  z 33 p1
    10  a  z 31 p2
    11  b  z 88 p1
    12  b  z 50 p2

For each level of `v2`, I would like to run a t-test for the difference in `dv` between conditions a and b of `v1`.

I could do this by subsetting the data frame by level of `v2` and then looping over the subsets, applying the t-test for the difference between conditions a and b of `v1` to each one. As I understand it, though, one of the strengths of R is avoiding explicit loops (using `apply` and related functions).

(I would then, of course, correct for multiple comparisons.)

Answer

One option is the so-called `apply` family of functions.

First you split your data into one subset per level of `v2`, then you apply a function to each subset.

Given that you want to run the t-test on the variable `dv`, the approach would look like this:

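```
# Split the data frame into a list with one data frame per level of v2
split_dat <- split(dat, dat$v2)

# For each subset, compare dv between v1 == "a" and v1 == "b"
sapply(split_dat, function(sub_dat) {
  result <- t.test(sub_dat[sub_dat$v1 == "a", "dv"],
                   sub_dat[sub_dat$v1 == "b", "dv"])
  return(result$p.value)
})
# Result (values will differ between runs because dv is drawn with sample()):
#         x         y         z
# 0.1220663 0.6092622 0.8887763
```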
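
Since the question also mentions correcting for multiple comparisons, here is a minimal sketch of how the p-values could be adjusted afterwards with base R's `p.adjust`. The `set.seed(1)` call, the names `p_raw` and `p_adj`, and the choice of the Holm method are assumptions made for illustration, not part of the original question. The sketch also uses the formula interface `t.test(dv ~ v1, ...)`, which should give the same test as the explicit subsetting above.

```r
# Rebuild the example data; set.seed() is added here only for reproducibility.
set.seed(1)
dat <- data.frame(v1 = rep(c("a", "a", "b", "b"), 3),
                  v2 = c(rep("x", 4), rep("y", 4), rep("z", 4)),
                  dv = sample(1:100, 12),
                  id = rep(c("p1", "p2"), 6))

# One Welch t-test (the t.test default) per level of v2, keeping the p-value.
p_raw <- sapply(split(dat, dat$v2), function(sub_dat) {
  t.test(dv ~ v1, data = sub_dat)$p.value
})

# Adjust the three p-values together; "holm" is an assumed choice of method.
p_adj <- p.adjust(p_raw, method = "holm")

# Show raw and adjusted p-values side by side, one column per level of v2.
print(rbind(raw = p_raw, holm = p_adj))
```

Other methods accepted by `p.adjust` (for example `"bonferroni"` or `"BH"`) can be swapped in through the `method` argument, depending on which kind of error rate you want to control.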