MariKo - 3 months ago
R Question

Remove a row depending on a value in a previous row

I have a data frame df:

Event Code
Picture hit
Picture incorrect
Picture hit
Picture hit
Picture incorrect
Picture hit
Picture incorrect
Picture hit
Picture miss
Picture hit


I want to remove the row that immediately follows each incorrect, so it would look like this:

Event Code
Picture hit
Picture incorrect
Picture hit
Picture incorrect
Picture incorrect
Picture miss
Picture hit


What is the optimal way to do it?

Answer

It depends on which language you are using. In R or MATLAB, which both support vectorised indexing, this is very easy. In R, you can utilise the efficiency of indexing directly:

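# Row numbers where the second column (Code) equals "incorrect"
Index <- which(DF[,2] == "incorrect")
# Drop the row after each match; skip when Index is empty, because DF[-integer(0), ] returns no rows
if (length(Index) > 0) DF <- DF[-(Index + 1), ]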
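
To try this on the data from the question, a minimal sketch follows. It assumes the two columns are named Event and Code and are stored as character vectors; the data frame is typed in by hand and the name Result is used only for illustration.

```r
# Sample data from the question (column names Event and Code are assumed)
DF <- data.frame(
  Event = rep("Picture", 10),
  Code  = c("hit", "incorrect", "hit", "hit", "incorrect",
            "hit", "incorrect", "hit", "miss", "hit"),
  stringsAsFactors = FALSE
)

# Positions of the "incorrect" rows: 2, 5 and 7
Index <- which(DF[, 2] == "incorrect")

# Drop the row that follows each of them (rows 3, 6 and 8)
Result <- DF[-(Index + 1), ]

print(Result$Code)
```

The seven remaining codes should match the desired output in the question: hit, incorrect, hit, incorrect, incorrect, miss, hit.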

Of course, an "incorrect" may also appear in the last row, in which case Index+1 points past the end of the data frame. You can account for this by adding the following line right after finding the Index:

if (length(Index) > 0 && Index[length(Index)] == nrow(DF)) { Index <- Index[-length(Index)] }

This line of code simply checks whether an "incorrect" was found in the last row of the data frame, as described above. If so, there is no following row to remove, so that index is excluded from the vector 'Index'.
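
As a variation on the same idea (not the method above, just a sketch), the edge cases can be sidestepped by shifting a logical vector instead of adding 1 to row numbers. The helper name drop_row_after and its arguments are hypothetical, and the column name Code is assumed from the question.

```r
# Hypothetical helper: drop every row whose preceding row has `value` in `column`
drop_row_after <- function(data, column = "Code", value = "incorrect") {
  # TRUE where the row matches; %in% treats NA as a non-match
  hit <- data[[column]] %in% value
  # Shift the flags down by one row: row i is flagged when row i - 1 matched.
  # The last flag falls off the end, so a trailing match needs no special case.
  after_hit <- c(FALSE, hit)[seq_along(hit)]
  data[!after_hit, , drop = FALSE]
}

# Small example with an "incorrect" in the last row
DF <- data.frame(
  Event = "Picture",
  Code  = c("hit", "incorrect", "miss", "incorrect"),
  stringsAsFactors = FALSE
)
Cleaned <- drop_row_after(DF)
print(Cleaned$Code)
```

Because the selection is a logical vector rather than negative row numbers, a data frame with no "incorrect" at all comes back unchanged instead of empty.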