Chabo Chabo - 1 month ago 9
R Question

Time zone missing from time series

I am merging two data frames on a date-time column and have run into a snag. The time column in one of the two data frames carries a time zone stamp:

#Example
"2012-09-28 08:15:00 MDT"


The time column in the other data frame does not:

#Example 2
"2012-09-28 08:15:00"


In my program both of these are POSIXct objects, formatted exactly the same apart from the time zone stamp. When I try to merge on the time columns, NAs appear because the values do not match each other.
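When two time columns look alike but refuse to match, it can help to inspect what each column really is before merging. The following is only a minimal sketch: the data frames `a` and `b` are hypothetical stand-ins, "America/Denver" is an assumed zone chosen because it reports MDT in summer, and the case shown (one key column that has become plain text) is just one way such a mismatch can arise.

```r
# Assumptions: `a` and `b` are hypothetical stand-ins for the two data frames,
# and "America/Denver" is an assumed zone standing in for MDT.
a <- data.frame(Time = as.POSIXct("2012-09-28 08:15:00", tz = "America/Denver"),
                x = 1)
b <- data.frame(Time = "2012-09-28 08:15:00", y = 2, stringsAsFactors = FALSE)

# Inspect the class and time zone attribute of each key column.
class(a$Time)
attr(a$Time, "tzone")
class(b$Time)

# Parse the text column in the same zone so both keys are POSIXct instants.
b$Time <- as.POSIXct(b$Time, format = "%Y-%m-%d %H:%M:%S", tz = "America/Denver")

# With matching classes and zones, the rows line up on the key.
merged <- merge(a, b, by = "Time")
merged
```

Passing `tz` explicitly on both sides avoids depending on the session's local time zone.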

I have narrowed the problem down to the data frame that is missing the time zone, and something strange is going on. When the date-time values sit outside the data frame, they print like this:

#Code used to make these values
NewTime<-as.POSIXct(TimeDis$datetime, format="%Y-%m-%d %H:%M")

>NewTime
[1] "2017-08-16 00:00:00 MDT" "2017-08-16 00:15:00 MDT"
[3] "2017-08-16 00:30:00 MDT" "2017-08-16 00:45:00 MDT"


Now when I put this into a data frame with data, the "MDT" does not show up

Discharge_Time<-data.frame(NewTime,DischargeFin)
> Discharge_Time
NewTime DischargeFin
1 2017-08-16 00:00:00 990525.2
2 2017-08-16 00:15:00 990525.2
3 2017-08-16 00:30:00 1000719.2
4 2017-08-16 00:45:00 1000719.2


Even stranger if I call,

>Discharge_Time[1,1]
"2017-08-16 MDT"


I get the MDT back but now no time....
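For what it is worth, these two displays come from R's print defaults rather than from lost data: printing a data frame formats POSIXct columns without the zone abbreviation, and a POSIXct value whose time of day is exactly midnight prints without the time part. A minimal sketch of how to check this, assuming "America/Denver" as a stand-in for the local MDT zone and reusing only two rows of the data above:

```r
# Assumption: "America/Denver" stands in for the poster's local MDT zone.
NewTime <- as.POSIXct(c("2017-08-16 00:00", "2017-08-16 00:15"),
                      format = "%Y-%m-%d %H:%M", tz = "America/Denver")
Discharge_Time <- data.frame(NewTime, DischargeFin = c(990525.2, 990525.2))

# The column is still POSIXct and still carries its time zone attribute,
# even though print(Discharge_Time) does not show the zone.
class(Discharge_Time$NewTime)
attr(Discharge_Time$NewTime, "tzone")

# Request the full timestamp explicitly instead of relying on print defaults.
full_stamp <- format(Discharge_Time[1, 1], "%Y-%m-%d %H:%M:%S %Z")
full_stamp
```

If the class and `tzone` attribute are intact, the display alone is not a reason for a merge to fail.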

I have no idea what is going on, but I am hoping to find a way to keep the MDT and everything else in that data frame so that I can merge it with the other data frame, which isn't missing anything.

Research Done:
How to change a time zone in a data frame?

Changing time zones with POSIXct time series, R

Answer Source

After many attempts to recreate this error, I traced it to the na.locf function from the zoo package. After padding my data to a 15-minute interval with the pad function from padr, I wanted to replace the resulting NA values with the previous value in each column. This works well, except that it strips the time zone from the date-time column, and that is where the problem came from. An example is shown below.

library(padr)
library(zoo)

#Dates Missing 8:30 for padding
Dates<-c("2017-08-18 08:00","2017-08-18 08:15","2017-08-18 08:45",
"2017-08-18 09:00")

#Example Data
Data<-c(1,2,3,4)

#Df
Df<-data.frame(Dates, Data)

#Change to POSIXct
Df$Dates<-as.POSIXct(Df$Dates, format="%Y-%m-%d %H:%M")

#We can see now that the Dates column has been assigned a time zone
> Df$Dates
[1] "2017-08-18 08:00:00 MDT" "2017-08-18 08:15:00 MDT"
[3] "2017-08-18 08:45:00 MDT" "2017-08-18 09:00:00 MDT"

#Now we Pad
Df<-pad(Df, interval='15 min')

#TZ is still intact (So it's not padr)
>Df[1,1]
[1] "2017-08-18 08:00:00 MDT"

#Here is where the problem lies, in the na.locf function from zoo (loaded above)
FixDf<-na.locf(Df, option="locf") #replaces NA with the previous value

FixDf[1,1]
[1] "2017-08-18 08:00:00"  #NO TIMEZONE!
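
One possible workaround, offered as a sketch rather than a definitive fix, is to carry values forward in the data column only, so the date-time column is never passed through na.locf. The padded frame is built by hand here with seq() so the example does not need padr, and "America/Denver" is an assumed zone standing in for MDT.

```r
library(zoo)

# Assumption: "America/Denver" stands in for the local MDT zone in the post.
tz <- "America/Denver"

# A padded frame like the one pad() produced above: 08:30 exists, Data is NA there.
Df <- data.frame(
  Dates = seq(as.POSIXct("2017-08-18 08:00", tz = tz),
              as.POSIXct("2017-08-18 09:00", tz = tz), by = "15 min"),
  Data  = c(1, 2, NA, 3, 4)
)

# Carry the last observation forward in the value column only,
# so the Dates column is left untouched.
FixDf <- Df
FixDf$Data <- na.locf(FixDf$Data, na.rm = FALSE)

class(FixDf$Dates)          # class of the untouched date-time column
attr(FixDf$Dates, "tzone")  # time zone attribute of that column
FixDf$Data                  # the gap at 08:30 now holds the previous value
```

With several value columns, the same call could be applied to each of them in turn while still skipping the date-time column.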