shoorideh - 9 months ago 68

R Question

I was wondering how I can use a loop function to calculate

`apply(table(data$people,data$event),2,function(x) mean(x[x>0]))`

separately for each level of `Colour`.

`people <- c("R1","R2","R2","R3","R3","R4","R4","R4","R4","R3","R3","R3","R3","R2","R2","R2","R5","R6")`

`event <- c("a","b","b","M","s","f","y","b","a","a","a","a","s","c","c","b","m","a")`

`Colour <- c("red","blue","green","pink","red","blue","grean","red","red","black","pink","blue","blue","green","blue","green","green","red")`

`data <- data.frame(people,event,Colour)`

Answer

To apply your calculation to each group, let's first wrap it in a function:

```
# For each event (column), average the per-person counts, ignoring zeros
your_function = function(data) {
  apply(table(data$people, data$event), 2, function(x) mean(x[x > 0]))
}
```
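As a quick sketch of what the inner anonymous function does for a single column of the table, consider the toy vectors below. The values in `x` and `y` are made-up counts for illustration, not taken from the question's data.

```r
# Made-up counts for one event: how often each of four people recorded it
x <- c(0, 2, 1, 0)

# Keep only the people who recorded the event at least once, then average
mean(x[x > 0])   # (2 + 1) / 2 = 1.5

# If nobody recorded the event, the subset is empty and the mean is NaN
y <- c(0, 0, 0)
mean(y[y > 0])   # NaN
```

So an event that never occurs within a group shows up as `NaN` rather than `0`.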

Then we can split your data up by Colour and apply your function to each sub-data-frame:

```
dat_split = split(data, f = data$Colour)
results = lapply(dat_split, your_function)
results
# $black
#   a   b   c   f   m   M   s   y
#   1 NaN NaN NaN NaN NaN NaN NaN
#
# $blue
#   a   b   c   f   m   M   s   y
#   1   1   1   1 NaN NaN   1 NaN
#
# $grean
#   a   b   c   f   m   M   s   y
# NaN NaN NaN NaN NaN NaN NaN   1
# ...
```
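One caveat: the output above, with every event listed in every group, relies on `people` and `event` being factors, which was the `data.frame()` default before R 4.0.0. With character columns, each sub-table only contains the values present in that group. If a single Colour-by-event table is what you are after, the following is a minimal base R sketch that should not depend on that setting. The names `counts` and `wide` are just illustrative.

```r
# Sample data from the question ("grean" is kept exactly as typed there)
people <- c("R1","R2","R2","R3","R3","R4","R4","R4","R4","R3","R3","R3","R3","R2","R2","R2","R5","R6")
event  <- c("a","b","b","M","s","f","y","b","a","a","a","a","s","c","c","b","m","a")
Colour <- c("red","blue","green","pink","red","blue","grean","red","red","black","pink","blue","blue","green","blue","green","green","red")
data <- data.frame(people, event, Colour)

# Count rows for every people/Colour/event combination ...
counts <- as.data.frame(table(people = data$people,
                              Colour = data$Colour,
                              event  = data$event))
# ... and keep only the combinations that actually occur
counts <- counts[counts$Freq > 0, ]

# Mean positive count per Colour/event; NA marks combinations never observed
wide <- tapply(counts$Freq, list(Colour = counts$Colour, event = counts$event), mean)
wide
```

For example, `wide["green", "b"]` is 2, because R2 is the only person with green "b" rows and has two of them. Note that unobserved combinations come out as `NA` here rather than `NaN`.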

Personally, I don't find the `split`/`lapply` approach very friendly. The `data.table` and `dplyr` packages make it easy to operate on subsets of a data frame. I would have used `dplyr` from the start, like this:

```
library(dplyr)
data %>%
  group_by(people, Colour, event) %>%
  summarize(n = n()) %>%
  group_by(Colour, event) %>%
  summarize(mean = mean(n)) %>%
  tidyr::spread(key = event, value = mean)
# Source: local data frame [6 x 9]
#
#   Colour     a     b     c     f     m     M     s     y
#   (fctr) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl)
# 1  black     1    NA    NA    NA    NA    NA    NA    NA
# 2   blue     1     1     1     1    NA    NA     1    NA
# 3  grean    NA    NA    NA    NA    NA    NA    NA     1
# ...
```
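In more recent versions of these packages, `tidyr::spread()` has been superseded by `pivot_wider()`, and `count()` can stand in for the first `group_by()`/`summarize()` pair. The following is a sketch of the same pipeline in that style, assuming dplyr 1.0.0 or later and tidyr 1.0.0 or later are installed. The name `result` is just illustrative.

```r
library(dplyr)
library(tidyr)

# Sample data from the question ("grean" is kept exactly as typed there)
people <- c("R1","R2","R2","R3","R3","R4","R4","R4","R4","R3","R3","R3","R3","R2","R2","R2","R5","R6")
event  <- c("a","b","b","M","s","f","y","b","a","a","a","a","s","c","c","b","m","a")
Colour <- c("red","blue","green","pink","red","blue","grean","red","red","black","pink","blue","blue","green","blue","green","green","red")
data <- data.frame(people, event, Colour)

result <- data %>%
  count(people, Colour, event) %>%                      # rows per person/Colour/event
  group_by(Colour, event) %>%
  summarise(mean = mean(n), .groups = "drop") %>%       # average over people
  pivot_wider(names_from = event, values_from = mean)   # one column per event

result
```

Unlike `spread()`, `pivot_wider()` by default orders the new columns by first appearance in the data rather than alphabetically, so the column order may differ from the output shown above.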