user1165199 - 3 years ago 165
R Question

# Find how long it has been since minimum value

I have a data frame in with a variable, value (which begins with NAs) and time at which that value is valid. For each row I want a new value determining the current

`cummin`
and how long it has been since that minimum.

I have the
`cummin`
part, but note the 'how long' part.

For example, here is my dataframe

``````example <- data.frame(variable = c('a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b'),
time = c(Sys.time(), Sys.time() + 5, Sys.time() + 15, Sys.time() + 34, Sys.time() + 51, Sys.time(), Sys.time() + 10, Sys.time() + 39, Sys.time() + 63, Sys.time() + 79, Sys.time() + 102, Sys.time() + 110, Sys.time() + 115),
value = c(NA, 3, 6, 2, 8, NA, NA, 1, 4, 3, 2, 0, 3))
``````

I am currently using this to get the
`cummin`
of the value by the variable

``````cum_na.rm <- function(x, func){
x[!is.na(x)] <- func(x[!is.na(x)])
x
}
example %>%
group_by(variable) %>%
mutate(current_min = cum_na.rm(value, cummin))
``````

I want to add something to get the time since the current_min to get something along the lines of the following:

``````   variable                time value current_min time_since_min
1        a 2017-10-02 17:13:59    NA          NA             NA
2        a 2017-10-02 17:14:04     3           3         0 secs
3        a 2017-10-02 17:14:14     6           3        10 secs
4        a 2017-10-02 17:14:33     2           2         0 secs
5        a 2017-10-02 17:14:50     8           2        17 secs
6        b 2017-10-02 17:13:59    NA          NA             NA
7        b 2017-10-02 17:14:09    NA          NA             NA
8        b 2017-10-02 17:14:38     1           1         0 secs
9        b 2017-10-02 17:15:02     4           1        24 secs
10        b 2017-10-02 17:15:18     3           1        40 secs
11        b 2017-10-02 17:15:41     2           1        63 secs
12        b 2017-10-02 17:15:49     0           0         0 secs
13        b 2017-10-02 17:15:54     3           0         5 secs
``````

The actual dataframe is over 1,000,000 rows with lots of variables, so anything which is particularly slow may be problematic!

Any help would be appreciated

You can add on a `group_by` for current_min, then add in the `time - time[1]`.
``````example %>%