runjumpfly runjumpfly - 1 month ago 6
R Question

Average number of seconds between two time observations

I have an irregular time index from an xts object. I need to find the average number of seconds between two consecutive time observations. This is my sample data:

dput(tt)
structure(c(1371.25, NA, 1373.95, NA, NA, 1373, NA, 1373.95,
1373.9, NA, NA, 1374, 1374.15, NA, 1374, 1373.85, 1372.55, 1374.05,
1374.15, 1374.75, NA, NA, 1375.9, 1374.05, NA, NA, NA, NA, NA,
NA, NA, 1375, NA, NA, NA, NA, NA, 1376.35, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, 1376.25, NA, 1378, 1376.5, NA, NA, NA, 1378,
1378, NA, NA, 1378.8, 231.9, 231.85, NA, 231.9, 231.85, 231.9,
231.8, 231.9, 232.6, 231.95, 232.35, 232, 232.1, 232.05, 232.05,
232.05, 231.5, 231.3, NA, NA, 231.1, 231.1, 231.1, 231, 231,
230.95, 230.6, 230.6, 230.7, 230.6, 231, NA, 231, 231, 231.45,
231.65, 231.4, 231.7, 231.3, 231.25, 231.25, 231.4, 231.4, 231.85,
231.75, 231.5, 231.55, 231.35, NA, 231.5, 231.5, NA, 231.5, 231.25,
231.15, 231, 231, 231, 231.05, NA), .Dim = c(60L, 2L), .indexCLASS = c("POSIXct",
"POSIXt"), tclass = c("POSIXct", "POSIXt"), .indexTZ = "Asia/Calcutta", tzone = "Asia/Calcutta", index = structure(c(1459482299,
1459482301, 1459482302, 1459482303, 1459482304, 1459482305, 1459482306,
1459482307, 1459482309, 1459482310, 1459482311, 1459482312, 1459482314,
1459482315, 1459482316, 1459482317, 1459482318, 1459482319, 1459482320,
1459482321, 1459482322, 1459482323, 1459482324, 1459482326, 1459482328,
1459482329, 1459482330, 1459482331, 1459482332, 1459482336, 1459482337,
1459482338, 1459482339, 1459482342, 1459482344, 1459482346, 1459482347,
1459482348, 1459482349, 1459482590, 1459482591, 1459482594, 1459482595,
1459482596, 1459482597, 1459482598, 1459482599, 1459482602, 1459482603,
1459482604, 1459482609, 1459482610, 1459482611, 1459482612, 1459482613,
1459482618, 1459482619, 1459482620, 1459482622, 1459482628), tzone = "Asia/Calcutta", tclass = c("POSIXct",
"POSIXt")), .Dimnames = list(NULL, c("A", "B")), class = c("xts",
"zoo"))


This is my attempt:

difftime(index(tt),index(lag.xts(tt, k=1)), units=c("auto"))
Time differences in secs
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
attr(,"tclass")
[1] "POSIXct" "POSIXt"


Any help is highly appreciated.


Edit:


Based on the answers, I wrote the following code, which is meant to calculate the mean number of seconds between observations for A and B on each day.

However, the code uses the index of tt rather than that of A or B, so the results for A and B are the same.

fun.time= function(x) mean(diff(time(x)))
df.time<-do.call(rbind, lapply(split(tt, "days"), FUN=function (x) {do.call(cbind, lapply(as.list(x), fun.time))}))


dput(df.time)
structure(c(5.57627118644068, 5.57627118644068), .Dim = 1:2, .Dimnames = list(
NULL, c("A", "B")))

Answer

Try this:

mean(diff(as.numeric(time(tt))))
##  5.5763

Without as.numeric, the result is a difftime object, which happens to be in seconds in this case; however, seconds are not guaranteed, because the units are chosen automatically and another input could, for example, return minutes. To avoid this unpredictability, it seems easier to bypass difftime objects and convert the times to numeric seconds, as above.
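
The units pitfall can be reproduced with a small sketch in base R. The timestamps here are made up for illustration (two hours apart) and are not taken from the question's data.

```r
# Three made-up timestamps spaced two hours (7200 seconds) apart
times <- as.POSIXct(c(0, 7200, 14400), origin = "1970-01-01", tz = "UTC")

# diff() on POSIXct picks the units automatically; here it picks hours
d <- diff(times)
units(d)                                   # "hours"
mean_auto <- as.numeric(mean(d))           # 2, i.e. hours, not seconds

# Converting to numeric first always yields seconds
mean_secs <- mean(diff(as.numeric(times))) # 7200

# Alternatively, request seconds explicitly when converting the difftime
mean_secs2 <- as.numeric(mean(d), units = "secs") # 7200
```

Either of the last two forms gives a result in seconds regardless of how far apart the observations are.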

Regarding the edit at the end of the question, it is not completely clear what the aim is.

1) Both columns belong to the same xts object and therefore share one time index, so they necessarily have the same mean time difference; the value computed above is that common mean.

2) If the aim is the mean time difference for each of the two series separately, after dropping its NA values, then:

sapply(as.list(tt), function(x) mean(diff(as.numeric(time(na.omit(x))))))

giving:

      A       B 
14.9545  6.2115 
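
To see why dropping the NA values changes the result per column, here is a minimal sketch on a made-up four-row object. The object `toy` and its values are hypothetical and chosen so the arithmetic can be checked by hand; it assumes the xts package is installed.

```r
library(xts)

# Hypothetical toy data: observations at 0, 10, 30 and 60 seconds
idx <- as.POSIXct(c(0, 10, 30, 60), origin = "1970-01-01", tz = "UTC")
toy <- xts(cbind(A = c(1, NA, 3, 4), B = c(NA, 2, 3, NA)), order.by = idx)

# Shared index: one mean gap for the whole object, (10 + 20 + 30) / 3
mean_all <- mean(diff(as.numeric(time(toy))))

# Per column: na.omit() drops the NA rows, so each column keeps only
# the timestamps at which it was actually observed
per_col <- sapply(as.list(toy), function(x) {
  mean(diff(as.numeric(time(na.omit(x)))))
})
per_col
```

Column A is observed at 0, 30 and 60 seconds (mean gap 30), while column B is observed at 10 and 30 seconds (mean gap 20), so the per-column results differ from each other and from the shared-index mean of 20.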

3) If the aim is to do the above by date then first create a test object that has more than one date and do this:

# test input
tt2 <- tt
time(tt2) <- time(tt) + seq(1, 24*60*60, length = 60)

do.call(cbind, lapply(as.list(tt2), function(x) {
  times <- time(na.omit(x))
  aggregate(zoo(as.numeric(times), format(times)), as.Date, function(x) mean(diff(x)))
}))

giving the following zoo series:

                A      B
2016-04-01 3029.0 1648.9
2016-04-02 5416.1 1633.0
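
A variation closer to the code in the question's edit is sketched here: it keeps split(tt, "days") but drops the NA values inside the helper. This is only one possible approach, not the method used above, and the object `toy2` is made-up data spanning two UTC days. Note that split() forms the days in the index's own time zone.

```r
library(xts)

# Hypothetical toy data: four observations on day 1, three on day 2 (UTC)
day <- 24 * 60 * 60
idx <- as.POSIXct(c(0, 10, 30, 60, day, day + 20, day + 50),
                  origin = "1970-01-01", tz = "UTC")
toy2 <- xts(cbind(A = c(1, NA, 3, 4, 5, 6, 7),
                  B = c(NA, 2, 3, NA, 1, NA, 3)), order.by = idx)

# Mean gap in seconds between the non-NA observations of one column
fun.time <- function(x) mean(diff(as.numeric(time(na.omit(x)))))

# One row per day, one column per series
res <- do.call(rbind, lapply(split(toy2, "days"), function(d) {
  sapply(as.list(d), fun.time)
}))
res
```

A day on which a column has fewer than two non-NA observations yields NaN, since there is no gap to average.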

Update: Have added as.numeric to ensure result is in seconds and have added a response to the Edit section of the question. Have also modified the aggregate statement to use format(times) to avoid time zone problems.
