user299791 user299791 - 1 month ago 6
R Question

How to find percent values by joining to column from another data.table

I have built a data.table dataframe

dt1
like:

State Bounced Opened Delivered
FL 2 1 0
NY 0 1 1


and I have another data.table
dt2
with the total values:

State Total
NY 2
FL 3


I'd like to find the percent frequency for each column in the first data table:

State Bounced Opened Delivered
FL 66.66 33.33 0
NY 0 50 50


If I simply * 100 and divide all the first data table by the Total column of the second data table, I have the result but the first column with State names became NA, which is not good.

If I do like:

dt1[, percen:=Bounced/dt2$Total]


it works but I have to repeat the process for all the columns. Is there a way to replicate to all the columns but the first one?

Second question: what is the best way to check that the Total value used is the right one - i.e. belongs to the right State? What I have done so far is to just order the State column alphabetically, but I am wondering if data.table offers a way to control this while doing the percentages...

Answer

I am not very good with data.table , however, I think this would work.

cbind(State = df1$State, 
      df1[, -1, with = FALSE]/df2$Total[match(df1$State, df2$State)]* 100)

#   State  Bounced   Opened  Delivered
#1:    FL  66.667    33.333         0
#2:    NY  0.000     50.000        50

Considering df1 and df2 are data.table.

This would also answer your second question as it matches the State in df1 with that of df2.