wetcoaster - 1 year ago
R Question

Mutate Cumsum with Previous Row Value

I am trying to run a cumsum on two separate columns of a data frame. The columns are essentially tallies of events for two different variables, and only one variable can have an event recorded per row. My approach was to create a new column holding the value 1, then create two new columns that accumulate each variable's total. This gives the correct final counts, but in my current ifelse statement, when the event recorded is for variable “A”, variable “B” is assigned 0 on that row. Instead, I want each row to carry forward the other variable's previous total, so that a column does not go from 1 to 2, to 0, to 3.

I don't want to run summarize on this either. I would prefer to keep each recorded instance and add the new columns through mutate.


Current output:

Event  Value  Variable  Total.A  Total.B
1      1      A         1        0
2      1      A         2        0
3      1      B         0        1
4      1      A         3        0


Desired output:

Event  Value  Variable  Total.A  Total.B
1      1      A         1        0
2      1      A         2        0
3      1      B         2        1
4      1      A         3        1


Answer

Logical values are treated as ones and zeroes when summed, so you can apply cumsum directly to a comparison:

DF$Total.A <- cumsum(DF$Variable == "A")
DF$Total.B <- cumsum(DF$Variable == "B")
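
As a minimal sketch, here is the running-count idea applied to the sample data from the question. The data frame is rebuilt by hand from the posted table, so the object name DF and the exact column types are assumptions rather than the asker's real data.

```r
# Rebuild the question's sample data (column names taken from the posted table).
DF <- data.frame(
  Event    = 1:4,
  Value    = 1,
  Variable = c("A", "A", "B", "A"),
  stringsAsFactors = FALSE
)

# A logical vector is coerced to 0/1 by cumsum(), so each column keeps a
# running count that simply carries forward on rows where the test is FALSE.
DF$Total.A <- cumsum(DF$Variable == "A")
DF$Total.B <- cumsum(DF$Variable == "B")

print(DF)
```

On row 3 (a "B" event) Total.A stays at 2 instead of dropping to 0, which is the behaviour asked for. The same expression should also work inside dplyr's mutate(), for example mutate(DF, Total.A = cumsum(Variable == "A")).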

Or, as a more general approach suggested by @Frank, you can create one column per distinct value:

# Distinct values of Variable; one running-count column is created for each.
uv <- unique(as.character(DF$Variable))
DF[, paste0("Total.", uv)] <- lapply(uv, function(x) cumsum(DF$Variable == x))
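
If this is needed in more than one place, the general approach could be wrapped in a small function. The sketch below is only an illustration: add_running_totals and its prefix argument are hypothetical names, and the sample data adds a third value "C" that is not in the question, purely to show that new values get their own column.

```r
# Hypothetical helper: add one running-count column per distinct value of `col`.
add_running_totals <- function(df, col, prefix = "Total.") {
  values <- unique(as.character(df[[col]]))
  # One cumulative count per distinct value, stored as <prefix><value>.
  totals <- lapply(values, function(v) cumsum(df[[col]] == v))
  df[paste0(prefix, values)] <- totals
  df
}

# Sample data modelled on the question, with an extra "C" event (assumption).
DF <- data.frame(
  Event    = 1:5,
  Value    = 1,
  Variable = c("A", "A", "B", "A", "C"),
  stringsAsFactors = FALSE
)

DF <- add_running_totals(DF, "Variable")
print(DF)
```

Passing the column name as a string avoids relying on the partial matching that `$` performs on data frame names, which is ambiguous here because both Value and Variable start with "V".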
