arvi1000 arvi1000 - 3 months ago 8
R Question

round.POSIXt() returns list inside data.table

I have a date-time column stored as character in a data.table. When I convert to POSIXct and then try rounding to date-only, I get weird results.

library(data.table)
library(lubridate)

# suppose I have these dates, in a data.table
date_chr <- c("2014-04-09 8:37 AM", "2014-09-16 6:04 PM",
"2014-09-30 3:26 PM", "2014-11-13 12:47 PM",
"2014-11-05 12:25 PM")
dat <- data.table(date_chr)

# I convert to POSIXct...
dat[, my_date := ymd_hm(date_chr)]

# ...and I want to round to date only, but this doesn't work
dat[, date_only := round(my_date, 'days')] # why does this return a list?
dat[, date_only := trunc(my_date, 'days')] # this too


class(dat$date_only) is list, and I get this warning message:

# Warning message:
# In `[.data.table`(dat, , `:=`(date_only, round(my_date, "days"))) :
# Supplied 9 items to be assigned to 5 items of column 'date_only' (4 unused)


Meanwhile, this works fine!

dat_df <- data.frame(date_chr, stringsAsFactors = FALSE)
dat_df$my_date <- ymd_hm(dat_df$date_chr)
dat_df$date_only <- round(dat_df$my_date, 'days')


class(dat_df$date_only) is POSIXlt, POSIXt, as desired.

My question is, why is this and how can I avoid the issue when using data.table? There are work-arounds, like truncating the time portion of date_chr before converting, but it seems like round.POSIXt() ought to work.

Thanks for any thoughts.

Answer

Already pretty well answered in comments by @SymbolixAU.
Addressing your question about the data.frame/data.table difference on that matter.
The major difference comes from the fact that POSIXlt takes much more memory than POSIXct, and data.table does care about memory.

object.size(Sys.time())
#312 bytes
object.size(as.POSIXlt(Sys.time()))
#2144 bytes
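
The size gap, and the list column seen in the question, both follow from how the two classes are stored. As a minimal sketch (the variable names now_ct and now_lt are made up for illustration), you can inspect the underlying objects: POSIXct is a plain numeric vector, while POSIXlt is a named list of components, which is presumably why data.table ends up treating the result of round() as a list.

```r
# POSIXct: a numeric vector of seconds since the epoch.
# POSIXlt: a named list of components (sec, min, hour, mday, ...).
now_ct <- Sys.time()
now_lt <- as.POSIXlt(now_ct)

is.list(unclass(now_ct))   # FALSE
is.list(unclass(now_lt))   # TRUE

# Component names; the exact set depends on the R version.
names(unclass(now_lt))
```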

Note that you can still use the POSIXlt data type (and its methods) in data.table's j argument; just make sure to convert the result to POSIXct when assigning it to a column.
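
A minimal sketch of that conversion, assuming a small table built with base R's as.POSIXct() rather than lubridate (the sample times are the question's first three values rewritten in 24-hour form, and tz = "UTC" is an arbitrary choice for the example): wrap the POSIXlt result of round() or trunc() in as.POSIXct() before assigning with :=.

```r
library(data.table)

# Sample data: 24-hour versions of three timestamps from the question.
dat <- data.table(my_date = as.POSIXct(
  c("2014-04-09 08:37", "2014-09-16 18:04", "2014-09-30 15:26"),
  format = "%Y-%m-%d %H:%M", tz = "UTC"
))

# round() and trunc() return POSIXlt; convert back to POSIXct
# before storing the result in a column.
dat[, date_only := as.POSIXct(round(my_date, "days"))]
dat[, day_start := as.POSIXct(trunc(my_date, "days"))]

class(dat$date_only)  # "POSIXct" "POSIXt"
```

Note that round() moves times after noon to the next day, while trunc() always keeps the same calendar day, so pick whichever matches what "date only" should mean for your data.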
