user8420488483439 - 5 days ago
R Question

dplyr group_by and filter

Consider the following dplyr query:

> mpg %>% group_by(class) %>% summarise(n())


The output is

# A tibble: 7 x 2
  class      n()
  <chr>      <int>
1 2seater        5
2 compact       47
3 midsize       41
4 minivan       11
5 pickup        33
6 subcompact    35
7 suv           62


Now, I want to filter the result as follows:

> mpg %>% group_by(class) %>% filter(hwy==21) %>% summarise(n())


That is, I want to show, for each class, the number of cars with a highway mileage of 21. Here is the result:

# A tibble: 2 x 2
  class      n()
  <chr>      <int>
1 minivan        1
2 subcompact     1


This is the expected result, but what I want instead is to see all the classes again: if a class has no car with a highway mileage of 21, then n() should be reported as 0. How can I do this?

In other words, I want the dplyr query that shows the following output:

# A tibble: 7 x 2
  class      n()
  <chr>      <int>
1 2seater        0
2 compact        0
3 midsize        0
4 minivan        1
5 pickup         0
6 subcompact     1
7 suv            0


where n() is the number of cars in each class with a highway mileage of 21.

Is this possible?

Answer

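Try this. Instead of filtering rows out, flag each row with a logical column and sum that column within each class; TRUE counts as 1 and FALSE as 0, so classes with no match are kept and report 0: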

mpg %>% mutate(k=(hwy==21)) %>% group_by(class) %>%
   summarise(n=sum(k))
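
To see why filter() loses classes while summing a condition does not, here is a minimal sketch on a small made-up data frame. The name cars and its values are assumptions for illustration only, not rows from mpg. It also shows that the mutate() step can probably be folded into summarise() directly:

```r
library(dplyr)

# Hypothetical toy data standing in for mpg (values are made up).
cars <- data.frame(
  class = c("compact", "compact", "minivan", "minivan", "suv"),
  hwy   = c(29, 27, 21, 24, 21),
  stringsAsFactors = FALSE
)

# filter() removes non-matching rows before the count,
# so a class with no matching row disappears from the result.
filtered <- cars %>%
  filter(hwy == 21) %>%
  group_by(class) %>%
  summarise(n = n())

# Summing the logical condition keeps every class:
# each TRUE adds 1 and each FALSE adds 0.
counts <- cars %>%
  group_by(class) %>%
  summarise(n = sum(hwy == 21))

print(filtered)
print(counts)
```

On this toy data, filtered has no row for compact, while counts lists compact with n equal to 0. If hwy could contain missing values, sum(hwy == 21, na.rm = TRUE) would be the safer form.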