MariKo - 3 months ago
R Question

Create a new column depending on different values in other columns R

I have a large data set; a shortened version looks like this:

> df
Stimulus TimeDiff
S102 10332.4
S 66 1095.4
S103 2987.8
S 77 551.4
S112 3015.2
S 66 566.6
S114 5999.8
S 88 403.8
S104 4679.4
S 88 655.2


I want to create a new column, df$Accuracy, that labels responses as correct, incorrect, or missed, depending on the values in df$Stimulus (only the S 88, S 66, and S 77 rows) and in df$TimeDiff. For example, if S 88 is preceded by S114 or S104 and df$TimeDiff for that row is less than 710, then df$Accuracy should be "incorrect". The data set would then look like this:

> df
Stimulus TimeDiff Accuracy
S102 10332.4 NA
S 66 1095.4 NA
S103 2987.8 NA
S 77 551.4 NA
S112 3015.2 NA
S 66 566.6 NA
S114 5999.8 NA
S 88 403.8 incorrect
S104 4679.4 NA
S 88 655.2 incorrect


What is the best way to do it?

Answer

You can use ifelse together with the lag function from dplyr, which returns the value from the previous row:

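library(dplyr)  # provides lag(), which masks stats::lag
# Note: the codes here have no spaces ('S88'). If your data contain 'S 88',
# match that spelling or strip the spaces first, e.g. gsub(" ", "", Stimulus).
df$Accuracy <- with(df, ifelse(Stimulus %in% c('S88', 'S66', 'S77') &
                                   dplyr::lag(Stimulus) %in% c('S114', 'S104') &
                                           TimeDiff < 710, 'incorrect', NA))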
df
#   Stimulus TimeDiff  Accuracy
#1      S102  10332.4      <NA>
#2       S66   1095.4      <NA>
#3      S103   2987.8      <NA>
#4       S77    551.4      <NA>
#5      S112   3015.2      <NA>
#6       S66    566.6      <NA>
#7      S114   5999.8      <NA>
#8       S88    403.8 incorrect
#9      S104   4679.4      <NA>
#10      S88    655.2 incorrect
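
The sample data in the question contain a space in some codes ("S 66", "S 77", "S 88"), while the code above matches "S66", "S77", and "S88". The following is a minimal, self-contained sketch of the same rule that rebuilds the posted data with the spaces and strips them before comparing. It uses only base R, with c(NA, head(x, -1)) standing in for dplyr's lag. The helper names (stim, prev_stim, is_response, after_target, under_cutoff) are illustrative and not from the original post.

```r
# Rebuild the sample data as posted, including the space in "S 66" etc.
df <- data.frame(
  Stimulus = c("S102", "S 66", "S103", "S 77", "S112",
               "S 66", "S114", "S 88", "S104", "S 88"),
  TimeDiff = c(10332.4, 1095.4, 2987.8, 551.4, 3015.2,
               566.6, 5999.8, 403.8, 4679.4, 655.2),
  stringsAsFactors = FALSE
)

# Normalise the codes so that "S 88" and "S88" compare equal.
stim <- gsub(" ", "", df$Stimulus, fixed = TRUE)

# Base-R stand-in for dplyr::lag(): the previous row's value, NA for row 1.
prev_stim <- c(NA, head(stim, -1))

is_response  <- stim %in% c("S88", "S66", "S77")    # rows that can be scored
after_target <- prev_stim %in% c("S114", "S104")    # preceded by S114 or S104
under_cutoff <- df$TimeDiff < 710                   # threshold from the question

# Label a row "incorrect" only when all three conditions hold; otherwise NA.
df$Accuracy <- ifelse(is_response & after_target & under_cutoff, "incorrect", NA)
df
```

Because %in% returns FALSE rather than NA for the missing "previous" value in the first row, that row is left as NA instead of causing an error. Further labels such as "correct" or "miss" could presumably be added with nested ifelse calls once their rules are specified.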