solomo31 - 1 year ago 84

R Question

I'm pretty new to R and I'm trying to figure out how to write code to get the frequency for multiple columns based on different conditions.

**Example Data**

`ID Group Age Gender Total_T Neg_Mood_T Interpersonal_Prob_T`

6000-01-00 0 9 1 44.00 49.00 42.00 44.00 48.00 40.00

6000-02-00 0 12 1 53.00 54.00 42.00 59.00 52.00 51.00

6000-03-00 0 7 2 72.00 50.00 56.00 58.00 81.00 84.00

6000-04-00 0 7 1 41.00 44.00 49.00 47.00 41.00 40.00

6000-05-00 0 9.5 1 38.00 44.00 42.00 39.00 41.00 40.00

6000-06-00 1 8 1 39.00 38.00 57.00 39.00 41.00 40.00

6000-07-00 1 9 1 38.00 44.00 42.00 39.00 41.00 40.00

6000-08-00 1 18 1 41.00 44.00 44.00 48.00 41.00 40.00

6000-09-00 1 9 2 58.00 54.00 45.00 47.00 69.00 56.00

6000-10-00 1 11 2 42.00 40.00 45.00 47.00 46.00 40.00

I began with some simple code that counts how often a variable meets a condition:

condition 1:

`Total_T <- sum(data$Total_T[data$Group==0]>=60, na.rm=TRUE)`

condition 2:

`Total_T <- sum(data$Total_T[data$Group==0]<60, na.rm=TRUE)`

However, I need to repeat this code many more times for different variables and different conditions (i.e. condition 1 would be repeated for 4 more variables, as would condition 2, and so forth), and I would like to make it more efficient.

So, I'm hoping to write code that returns the frequency for Total_T, Neg_Mood_T, etc., based on the conditions I place on Group, Age and Gender.

I've tried using `data.frame(table())` and `ddply`.

Thanks !

Answer Source

We can use `subset` to get the part of the data we need, then `sum`:

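```
# Rows with Group 0 and Gender 1, keeping only Total_T
x1 <- subset(data, Group == 0 & Gender == 1, select = "Total_T")
sum(x1[x1 >= 60], na.rm = TRUE)  # sum of the values that are >= 60
sum(x1[x1 < 60], na.rm = TRUE)   # sum of the values that are < 60

# Wrapped in a function (sums the values below 60, which matches the output shown)
fun <- function(cols) {
  x1 <- subset(data, Group == 0 & Gender == 1, select = cols)
  sum(x1[x1 < 60], na.rm = TRUE)
}
fun("Total_T")
# [1] 176
fun("Neg_Mood_T")
# [1] 191
```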

If you would like to get all the columns in one shot, you can use:

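```
library(dplyr)
data %>%
  filter(Group == 0 & Gender == 1) %>%
  # skip the first four columns (ID, Group, Age, Gender);
  # a formula replaces funs(), which is deprecated as of dplyr 0.8.0
  summarise_at(-(1:4), ~ sum(.x[.x < 60]))
#   Total_T Neg_Mood_T Interpersonal_Prob_T col7 col8 col9
# 1     176        191                  175  189  182  171
```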
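If the same summary is needed for every combination of conditions rather than one `filter` call at a time, grouping may be simpler. The following is only a sketch in base R, assuming the sample rows from the question and using just the three score columns that are named there (the remaining three are unnamed in the question and are left out). It applies the same sum of values below 60 as the call above to each Group/Gender combination.

```r
# Sample rows from the question, restricted to the named columns
data <- data.frame(
  Group  = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1),
  Gender = c(1, 1, 2, 1, 1, 1, 1, 1, 2, 2),
  Total_T              = c(44, 53, 72, 41, 38, 39, 38, 41, 58, 42),
  Neg_Mood_T           = c(49, 54, 50, 44, 44, 38, 44, 44, 54, 40),
  Interpersonal_Prob_T = c(42, 42, 56, 49, 42, 57, 42, 44, 45, 45)
)

# One row per Group/Gender combination; each score column holds the
# sum of its values below 60 within that combination.
# Note: the formula interface drops rows containing NA by default.
res <- aggregate(
  cbind(Total_T, Neg_Mood_T, Interpersonal_Prob_T) ~ Group + Gender,
  data = data,
  FUN = function(v) sum(v[v < 60], na.rm = TRUE)
)
res
```

The row for Group 0 and Gender 1 should reproduce the 176, 191 and 175 shown above for the first three columns.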

**Edit**

There is a difference between summing the values of `Total_T` that meet the condition and counting how many values meet it. An example shows the difference:

```
x <- 1:10
# condition
x > 5
# 1. sum the values that meet the condition
sum(x[x > 5])
# [1] 40
# 2. count how many values meet the condition
sum(x > 5)
# [1] 5
```
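Since the question asks for frequencies, the second form is probably the one wanted. Below is a minimal sketch of how it could be applied to several columns at once in base R. The helper name `count_scores` and its arguments are hypothetical, the cutoff of 60 comes from the question, and the data frame contains only the sample rows and the named score columns from the question.

```r
# Sample rows from the question, restricted to the named columns
data <- data.frame(
  Group  = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1),
  Gender = c(1, 1, 2, 1, 1, 1, 1, 1, 2, 2),
  Total_T              = c(44, 53, 72, 41, 38, 39, 38, 41, 58, 42),
  Neg_Mood_T           = c(49, 54, 50, 44, 44, 38, 44, 44, 54, 40),
  Interpersonal_Prob_T = c(42, 42, 56, 49, 42, 57, 42, 44, 45, 45)
)
score_cols <- c("Total_T", "Neg_Mood_T", "Interpersonal_Prob_T")

# Hypothetical helper: for the rows selected by `keep`, count per column
# how many values are at or above the cutoff and how many are below it.
count_scores <- function(df, keep, cols, cutoff = 60) {
  sub <- df[which(keep), cols, drop = FALSE]  # which() ignores NA conditions
  rbind(
    at_or_above = colSums(sub >= cutoff, na.rm = TRUE),
    below       = colSums(sub <  cutoff, na.rm = TRUE)
  )
}

counts_g0 <- count_scores(data, data$Group == 0, score_cols)
counts_g0
# at_or_above: 1 0 0; below: 4 5 5 (one column per score)
```

Any logical condition on the rows can be passed as `keep`, for example `data$Group == 1 & data$Gender == 2`, so the same call covers the other conditions without repeating the counting code.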