FiofanS - 1 month ago
R Question

Function for summing up minutes across time windows

I am creating a function that sums up the difference between time_1 and time_2, distributed across time windows (from 8 till 9, from 9 till 10).

This is my sample data (please note that time_2 is always greater than time_1):

time_1 = c("08:20", "08:58", "09:30")
time_2 = c("08:50", "09:20", "09:48")
df = data.frame(time_1, time_2)


I've written the following function (it's not finished yet):

getTimePerIntervals <- function(df) {
    time_1_hour = as.numeric(substr(df$time_1, 1, 2))
    time_1_minutes = as.numeric(substr(df$time_1, 4, 5))
    time_2_hour = as.numeric(substr(df$time_2, 1, 2))
    time_2_minutes = as.numeric(substr(df$time_2, 4, 5))

    # loop over row indices (a for loop over a data frame iterates over its columns)
    for (i in seq_len(nrow(df))) {
        wt_8 = 0
        wt_9 = 0
        if (time_1_hour[i] == 8 && time_2_hour[i] == 8) {
            # both times fall in the 8-9 window
            wt_8 = time_2_minutes[i] - time_1_minutes[i]
        } else if (time_1_hour[i] == 9 && time_2_hour[i] == 9) {
            # both times fall in the 9-10 window
            wt_9 = time_2_minutes[i] - time_1_minutes[i]
        } else if (time_1_hour[i] == 8 && time_2_hour[i] == 9) {
            # interval crosses 9:00: split the minutes between the two windows
            wt_8 = 60 - time_1_minutes[i]
            wt_9 = time_2_minutes[i]
        }
        # how to put wt_8 and wt_9 as columns of df?
    }
    df
}


My questions are the following:


  1. How do I convert wt_8 and wt_9 to columns of df? Here wt_8 means the time window from 8 to 9, and wt_9 means the time window from 9 to 10. (Please notice that I want to have these variables, not just the overall time difference.)

  2. Is there a more flexible way to do the same thing? For instance, if the number of time windows is greater than 2, a less manual approach would probably be better.


Answer

Here is a direct approach.

time_1 = c("08:20", "08:58", "09:30") 
time_2 = c("08:50", "09:20", "09:48") 
df = data.frame(time_1, time_2)
time_1_hour = as.numeric(substr(df$time_1,1,2))
time_1_minutes = as.numeric(substr(df$time_1,4,5))
time_2_hour = as.numeric(substr(df$time_2,1,2))
time_2_minutes = as.numeric(substr(df$time_2,4,5))

window_names <- seq(min(time_1_hour), max(time_2_hour))
window_names <- paste(window_names,"-", window_names+1, sep="")
window_diffs <- matrix(rep(0, length(window_names)*nrow(df)), ncol=length(window_names))
colnames(window_diffs) <- window_names

for (i in seq.int(length(time_1_hour)))  {
    # first hour
    if(time_1_hour[i] < time_2_hour[i]) {
        wname <- paste(time_1_hour[i], "-", time_1_hour[i]+1, sep="")
        window_diffs[i, wname] <- 60 - time_1_minutes[i]
    }

    # full hours, not tested
    if(time_1_hour[i]+1 <= time_2_hour[i]-1) {
        wnames <- seq(time_1_hour[i]+1, time_2_hour[i]-1)
        wnames <- paste(wnames, "-", wnames+1, sep="")
        window_diffs[i, wnames] <- 60
    }

    # last hour
    if(time_1_hour[i] <= time_2_hour[i]) wname <- paste(time_2_hour[i], "-", time_2_hour[i]+1, sep="")
    if(time_1_hour[i] == time_2_hour[i] && time_1_minutes[i] < time_2_minutes[i]) 
        window_diffs[i, wname] <- time_2_minutes[i] - time_1_minutes[i]
    if(time_1_hour[i] < time_2_hour[i]) 
        window_diffs[i, wname] <- time_2_minutes[i]
}

df <- cbind(df, window_diffs)
# show the result (return() is only valid inside a function)
df

I believe this could be made faster, for example by vectorizing the loop over rows.
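One possible way to avoid the row-by-row loop is to treat each time as minutes since midnight and compute the overlap of every interval with every hourly window in one step. The sketch below is not part of the original answer: the function name minutes_per_window and the helper to_minutes are made up for illustration, and it assumes zero-padded "HH:MM" strings, both times on the same day, and time_2 not earlier than time_1.

```r
# Hypothetical helper (not from the original answer): minutes spent in each
# hourly window, computed without an explicit loop over rows.
minutes_per_window <- function(df) {
    # Convert zero-padded "HH:MM" strings to minutes since midnight.
    to_minutes <- function(x) {
        x <- as.character(x)
        as.numeric(substr(x, 1, 2)) * 60 + as.numeric(substr(x, 4, 5))
    }
    start <- to_minutes(df$time_1)
    end   <- to_minutes(df$time_2)

    # Hourly windows covering the data, e.g. hours 8 and 9 for the sample.
    hours     <- seq(floor(min(start) / 60), floor(max(end) / 60))
    win_start <- hours * 60
    win_end   <- win_start + 60

    # Overlap of each interval with each window:
    # (earlier of the two ends) - (later of the two starts), floored at zero.
    overlap <- pmax(outer(end, win_end, pmin) - outer(start, win_start, pmax), 0)
    colnames(overlap) <- paste(hours, "-", hours + 1, sep = "")

    cbind(df, overlap)
}

# Sample data from the question.
df <- data.frame(
    time_1 = c("08:20", "08:58", "09:30"),
    time_2 = c("08:50", "09:20", "09:48")
)
result <- minutes_per_window(df)
result
```

For the sample data this should give 30, 2, 0 in the "8-9" column and 0, 20, 18 in the "9-10" column, the same values as the loop above. The number of windows follows from the data, so nothing needs to change when intervals span more than two hours.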