SoccerAnalytics26 SoccerAnalytics26 - 2 months ago 14
R Question

Change variable for all further observations

I am trying to create a type of a "switch" in my data where when a specific event happens, a binary indicator will switch from 1 to 0 for all the next observations (data sorted by time). Below is a sample of my data currently.

Time Event ID_1 ID_2 ID_3 ID_4 ID_5
1 0 1 1 1 1 1
2 0 1 1 1 1 1
3 0 1 1 1 1 1
4 0 1 1 1 1 1
5 ID_2 1 1 1 1 1
6 0 1 1 1 1 1
7 ID_3 1 1 1 1 1
8 0 1 1 1 1 1


And then this is what I want:

Time Event ID_1 ID_2 ID_3 ID_4 ID_5
1 0 1 1 1 1 1
2 0 1 1 1 1 1
3 0 1 1 1 1 1
4 0 1 1 1 1 1
5 ID_2 1 0 1 1 1
6 0 1 0 1 1 1
7 ID_3 1 0 0 1 1
8 0 1 0 0 1 1


As you can see, when "EVENT" takes a value other than 0, I want the variable that matches what is in the EVENT value to switch from 1 to 0, and I want it to stay 0 as the observations continue to go on. Any help would be great.

Thanks.

Answer Source

I would do this by using library(data.table), creating a 'lookup' table of when the events occur, and then using a non-equi join to update the values that are after the event

## extract just the rows where the updates occur
eventRows <- df[df$Event != "0", c("Time", "Event")]

library(data.table)

## set as data.tables
setDT(eventRows)
setDT(df)

## melt into long-form
df_melted <- melt(df, id.vars = c("Time", "Event"))

## perform a non-equi join, and update-by-reference the values
## meeting the criteria
df_melted[
  eventRows
  , on = .(variable == Event, Time >= Time)
  , value := 0
]

## put back into wide form
df_res <- dcast(df_melted, formula = Time + Event ~ variable)
df_res
#    Time Event ID_1 ID_2 ID_3 ID_4 ID_5
# 1:    1     0    1    1    1    1    1
# 2:    2     0    1    1    1    1    1
# 3:    3     0    1    1    1    1    1
# 4:    4     0    1    1    1    1    1
# 5:    5  ID_2    1    0    1    1    1
# 6:    6     0    1    0    1    1    1
# 7:    7  ID_3    1    0    0    1    1
# 8:    8     0    1    0    0    1    1

## convert back to a data.frame using
## setDF(df)

Data

df <- read.table(text = "Time Event ID_1 ID_2 ID_3 ID_4 ID_5
1    0     1    1    1    1    1
                 2    0     1    1    1    1    1
                 3    0     1    1    1    1    1
                 4    0     1    1    1    1    1
                 5    ID_2  1    1    1    1    1
                 6    0     1    1    1    1    1
                 7    ID_3  1    1    1    1    1
                 8    0     1    1    1    1    1", header = T)