user6016731 - 4 months ago
R Question

Using dplyr to summarize columns

I have a dataset

company_category_list Cluster
Biotechnology 1
Software 2
Biotechnology|Search 1
Biotechnology 1
Biotechnology 1
Enterprise Software 3
Software 2


I want to get a count of the first column (company_category_list) grouped by the column Cluster, so I used the following code:

library(dplyr)
CountSummary <-SFBay_2012 %>%
group_by(Cluster) %>%
summarise(company_category_list_Count = count_(company_category_list))


But I am getting the following error:

Error: no applicable method for 'group_by_' applied to an object of class "factor"


Can anybody help out?
Thanks in advance!!

Answer

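The error occurs because count_() expects a data frame as its first argument, but inside summarise() it is handed the factor column company_category_list. I guess we need count() applied to the grouped data frame instead: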

SFBay_2012 %>%
        group_by(Cluster) %>% 
        count(company_category_list)   
#   Cluster company_category_list     n
#    <int>                 <chr> <int>
#1       1         Biotechnology     3
#2       1  Biotechnology|Search     1
#3       2              Software     2
#4       3   Enterprise Software     1

Or

SFBay_2012 %>% 
      count(Cluster, company_category_list)
#  Cluster company_category_list     n
#    <int>                 <chr> <int>
#1       1         Biotechnology     3
#2       1  Biotechnology|Search     1
#3       2              Software     2
#4       3   Enterprise Software     1

Or

SFBay_2012 %>%
        group_by(Cluster, company_category_list) %>% 
        tally()
#   Cluster company_category_list     n
#     <int>                 <chr> <int>
#1       1         Biotechnology     3
#2       1  Biotechnology|Search     1
#3       2              Software     2
#4       3   Enterprise Software     1

Or

SFBay_2012 %>%
     group_by(Cluster, company_category_list) %>%
     summarise(n = n())
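
For anyone who wants to try this without the original data, here is a minimal reproducible sketch. The data frame is rebuilt by hand from the sample rows in the question, so the exact column types are an assumption (the real SFBay_2012 apparently stores the category as a factor). The names by_count and by_summarise are only illustrative. Both approaches should give the same counts as the outputs shown above.

```r
library(dplyr)

# Sample rows copied from the question. Column types are an assumption:
# the category is kept as character here for simplicity.
SFBay_2012 <- data.frame(
  company_category_list = c("Biotechnology", "Software", "Biotechnology|Search",
                            "Biotechnology", "Biotechnology",
                            "Enterprise Software", "Software"),
  Cluster = c(1L, 2L, 1L, 1L, 1L, 3L, 2L),
  stringsAsFactors = FALSE
)

# Shortcut: count() groups by the given columns and tallies the rows.
by_count <- SFBay_2012 %>%
  count(Cluster, company_category_list)

# Explicit version: group, count rows with n(), then drop the leftover grouping.
by_summarise <- SFBay_2012 %>%
  group_by(Cluster, company_category_list) %>%
  summarise(n = n()) %>%
  ungroup()

print(by_count)
print(by_summarise)
```

The count() form is the shortest. The group_by() plus summarise() form is handy when other summaries (for example a mean of another column) are needed in the same call.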