Chris Chris - 1 month ago 8
R Question

Summarize number of cases in category and calculate new column

I hava a dataset with the following structure

zip code |type of crime
------ |------
1002 |crime1
1002 |crime1
1002 |crime2
1002 |crime1
9210 |crime1
9210 |crime1
9210 |crime2
9210 |crime2


I also have a list of minimum sentences for each crime

crime | minimum sentence (days)
------| ------
crime1|10
crime2|15


Using these two tables, I would like to do the following:


  1. calculate the total of each crime in each neighborhood

    zip code | crime |number of crimes
    ------ | ------ |-----
    1002 | crime1 | 3
    1002 | crime2 | 1
    9210 | crime1 | 2
    9210 | crime2 | 2

  2. multiply each crime by it's minimum sentence and then calculate the total of days by neighborhood.

    zip | crime | crimexdays
    ---- | ------ | -----
    1002 | crime1 | 30
    1002 | crime2 | 15
    9210 | crime1 | 20
    9210 | crime2 | 30



I'd really appreciate any help here. Cheers!!

Answer

Get the frequency with count, left_join with second dataset and trasmute to create the new column

df1 %>% 
     count(zipcode, typeofcrime) %>% 
     left_join(., df2, by = c("typeofcrime" = "crime")) %>% 
     transmute(typeofcrime, crimexsentence = n*minimumsentence) 
#     zipcode typeofcrime crimexsentence
#     <int>       <chr>          <int>
#1    1002      crime1             30
#2    1002      crime2             15
#3    9210      crime1             20
#4    9210      crime2             30
Comments