Tyler Durden - 21 days ago
R Question

How to handle blank items when converting dates in R

I have a csv download of data from a Management Information system. There are some variables which are dates and are written in the csv as strings of the format "2012/11/16 00:00:00".

After reading in the csv file, I convert these variables to dates using the function as.Date(). This works fine for all variables that do not contain any blank items.

For those which do contain blank items I get the following error message:
"character string is not in a standard unambiguous format"

How can I get R to replace blank items with something like "0000/00/00 00:00:00" so that the as.Date() function does not break? Are there other approaches you might recommend?

Answer

If they're strings, does something as simple as

mystr <- c("2012/11/16 00:00:00","   ","")
mystr[grepl("^ *$",mystr)] <- NA
as.Date(mystr)

work? The regular expression "^ *$" matches strings consisting of the start of the string (^), zero or more spaces ( *), and then the end of the string ($), that is, strings that are empty or contain only spaces. More generally, I think you could use "^[[:space:]]*$" to capture other kinds of whitespace (tabs, etc.).
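As a rough sketch of the more general version, the snippet below wraps the "^[[:space:]]*$" replacement in a small helper and passes an explicit format to as.Date(). The helper name blank_to_na and the sample vector are made up for illustration and are not part of the original answer. The format "%Y/%m/%d" is assumed from the strings shown in the question, and the trailing time part is ignored when parsing.

```r
# Hypothetical helper (name invented for this example): replace strings that
# are empty or contain only whitespace (spaces, tabs, ...) with NA.
blank_to_na <- function(x) {
  x[grepl("^[[:space:]]*$", x)] <- NA
  x
}

# Sample data in the format described in the question, with blank items mixed in.
mystr <- c("2012/11/16 00:00:00", " \t ", "", "2013/01/02 00:00:00")

# An explicit format avoids the format guessing that raises
# "character string is not in a standard unambiguous format".
dates <- as.Date(blank_to_na(mystr), format = "%Y/%m/%d")

print(dates)
print(is.na(dates))
```

The blank items come back as NA, so they can be filtered or counted with is.na() afterwards. When a format is supplied, as.Date() should already return NA for strings it cannot parse, so the explicit replacement mainly serves to make the intent visible.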
