Steen Harsted - 3 years ago 183

R Question

I am a rookie Stata user trying to make the jump to R. I am working through various exercises, but I keep getting something wrong with the group_by and subset commands.

I have a simple dataset on which I want to do group-based calculations. I am trying to use the group_by command from the dplyr package to do this.

My dataset is called itchy and consists of 4 variables:

treat - levels A and B (type of treatment)

type - levels Dark and Fair (skin colour)

y - levels 0 and 1 (failure or success of treatment)

freq - numerical variable indicating how many are in this particular group

Using this code you can recreate it:

```
type <- c(2,2,2,2,1,1,1,1)
treat <- c(1,1,2,2,1,1,2,2)
y <- c(1,0,1,0,1,0,1,0)
freq <- c(9,17,5,20,10,15,3,20)
itchy <- cbind.data.frame(type,treat,y,freq)
itchy$type <- as.factor(type)
itchy$type <- factor(itchy$type,levels = c(1,2), labels = c("Dark", "Fair"))
itchy$treat <- as.factor(treat)
itchy$treat <- factor(itchy$treat,levels = c(1,2), labels = c("A", "B"))
itchy$y <- as.factor(y)
itchy$y <- factor(itchy$y,levels = c(0,1), labels = c("failure", "succes"))
```

Now I would like to calculate the odds of success for treatments A and B when applied to skin types Dark and Fair (odds = number of successes / number of failures).

I have two questions:

1) Can you help me do the odds calculations by group?

2) I have tried various combinations of group_by and subset without any luck. The code below shows some of my unsuccessful attempts. Can you tell whether I have a basic misunderstanding of how the group_by and subset commands work?

```
itchy %>% group_by(treat, type) %>% summarize(ods = (subset(freq, y==1)/subset(freq, y==0)))

itchy %>% group_by(treat, type) %>% ods <- c((subset(freq, y==1)/subset(freq, y==0)))

itchy %>% group_by(treat, type) %>% itchy$ods <- (subset(freq, y==1)/subset(freq, y==0))
```


Answer Source

If I understand you correctly, I think the following will work. I made use of the spread function from the tidyr package, which, like dplyr, is part of the tidyverse.

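```
library(dplyr)  # for %>% and mutate
library(tidyr)  # for spread

# Put the two outcomes into their own columns, then divide them
itchy %>%
  spread(y, freq) %>%
  mutate(odds = succes / failure)

#   type treat failure succes      odds
# 1 Dark     A      15     10 0.6666667
# 2 Dark     B      20      3 0.1500000
# 3 Fair     A      17      9 0.5294118
# 4 Fair     B      20      5 0.2500000
```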
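
On your second question, here is a rough sketch of how the same numbers could be computed with group_by and summarise directly, in case that is closer to what you were attempting. It rebuilds your data so it runs on its own; `odds_by_group` is just a name I made up for the result. As far as I can tell, two things went wrong in your attempts: `y` is a factor whose labels are "failure" and "succes", so `y == 1` and `y == 0` never match any row, and a pipe cannot end in an assignment such as `%>% ods <- ...`.

```r
library(dplyr)

# Rebuild the example data from the question (labels kept as spelled there).
itchy <- data.frame(
  type  = factor(c(2, 2, 2, 2, 1, 1, 1, 1), levels = c(1, 2), labels = c("Dark", "Fair")),
  treat = factor(c(1, 1, 2, 2, 1, 1, 2, 2), levels = c(1, 2), labels = c("A", "B")),
  y     = factor(c(1, 0, 1, 0, 1, 0, 1, 0), levels = c(0, 1), labels = c("failure", "succes")),
  freq  = c(9, 17, 5, 20, 10, 15, 3, 20)
)

# Within each type/treat group, pick the success count and the failure count
# by comparing y against its labels, then divide. This assumes every group
# has exactly one "succes" row and one "failure" row, as in this dataset.
odds_by_group <- itchy %>%
  group_by(type, treat) %>%
  summarise(odds = freq[y == "succes"] / freq[y == "failure"]) %>%
  ungroup()

odds_by_group
```

This should give the same four odds as the spread version, one row per type and treat combination. Newer dplyr versions may print a message about the grouping of the summarise output, which can be ignored here.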
