MFR MFR - 6 days ago 6
R Question

Subtracting the last data with zero and the first data with 1 after that, for each group

Sorry if the title looks confusing. In the following data,

mydata<- data.frame(id= c("1","2","1", "1" ,"2" ,"1"), transaction = c (0,0,1,0,1,1) , time = c( 10, 20, 22, 27, 39, 47))


I have some visits ( when transaction = 0) and I have some transactions where (transaction =1). I wish to know the time difference between the last visit and the transaction for each id : For instance, the last time that id one had a visit was in time 10 befor his transaction that happened in time 22 which have 12 unit difference.
This is my data

id transaction time
1 1 0 10
2 2 0 20
3 1 1 22
4 1 0 27
5 2 1 39
6 1 1 47


And I wish to have the following output:

id transaction time dif
1 1 0 10 NA
2 2 0 20 NA
3 1 1 22 12
4 1 0 27 NA
5 2 1 39 19
6 1 1 47 20


I have tried something like that but it gives me something like that which is obviously wrong

library(dplyr)
mydata %>%
group_by( id) %>%

mutate( dif =first(time[transaction == 1])-last(time[transaction == 0]))

Answer

We can use data.table with the assignment (:=) in place to avoid any copying

library(data.table)
setDT(mydata)[, dif:=time[transaction!=0]- time[transaction==0], by = id][transaction==0, dif:= NA][]
Comments