Mehdi Farhangian Mehdi Farhangian - 2 months ago 17
R Question

Count rows that do not contain a string

I have the following dataset

#df
Factors Transactions
a,c 1
b 0
c 0
d,a 0
a 1
a 0
b 1


I'd like to know how many times we did not have a factor and we had a transaction. So, my desirable output is as follows:

#desired output
Factors count
a 1
b 2
c 2
d 3


For instance, only one time we didn't had
a
and we had a transaction (i.e. only in the last row).

There are many ways to know how many times we had each factor and we had transactions. For instance I tried this one:

library(data.table)

setDT(df)[, .(Factors = unlist(strsplit(as.character(Factors), ","))),
by = Transactions][,.(Transactions = sum(Transactions > 0)), by = Factors]


But I wish to count how many times we didn't have a factor and we had transaction.

Thanks in advance.

Answer

You can calculate the opposite, i.e, how many times the factor has a transaction and then the difference between the total transactions and transactions for each individual factor would be what you are looking for:

library(data.table)
total <- sum(df$Transactions > 0)

(setDT(df)[, .(Factors = unlist(strsplit(as.character(Factors), ","))), Transactions]
          [, total - sum(Transactions > 0), Factors])

#   Factors V1
#1:       a  1
#2:       c  2
#3:       b  2
#4:       d  3
Comments