Therru Therru - 2 months ago 7
R Question

How to Convert a Line into Multiple Columns in R

I have many text files and all these files include a line which starts with the same comment.

"HDR TIME_YMD=2001-02-16 T 00:00:00"
"HDR TIME_YMD=2001-03-18 T 00:00:00"


etc.

I want to be able to change the line into two columns and create a data frame as below.

Year Month
2001 02 (or February)
2001 03 (or March)


I have found similar answers from other questions. With the benefit of extract data between a pattern from a text file in R the code that I have imitated is as follows, but I couldn't code the "Month" column.

DT <- DT[, `:=` (Year = as.numeric(gsub('^.*(\\d+{4}).*','\\1',
grep('HDR TIME_YMD=', txt, value = TRUE))), Month=????


Another answers I am looking for;

I want to convert the data into three data columns as
'Longitude','Latitude','TWC'


Year Month Longitude Latitude TWC
1 2001 02 130.5 -16.5 6.935
2 2001 02 131.5 -16.5 13.912
3 2001 03 132.5 -16.5 13.244
4 2001 03 133.5 -16.5 15.556
5 2001 03 134.5 -16.5 21.380


I am very noob to R and I may need a detailed explanation. Thanks!

Answer

If the time stamp is formatted as well as you have shown, you can just split it and use index to get the year and month:

library(data.table)
dt[, c("Year", "Month") := tstrsplit(TimeStamp, "=|-")[2:3]]
dt
#                            TimeStamp Year Month
#1: HDR TIME_YMD=2001-02-16 T 00:00:00 2001    02
#2: HDR TIME_YMD=2001-02-16 T 00:00:00 2001    02
#3: HDR TIME_YMD=2001-02-16 T 00:00:00 2001    02

Where dt is:

dt = data.table(TimeStamp = c("HDR TIME_YMD=2001-02-16 T 00:00:00", 
                              "HDR TIME_YMD=2001-02-16 T 00:00:00", 
                              "HDR TIME_YMD=2001-02-16 T 00:00:00"))
dt
#                             TimeStamp
# 1: HDR TIME_YMD=2001-02-16 T 00:00:00
# 2: HDR TIME_YMD=2001-02-16 T 00:00:00
# 3: HDR TIME_YMD=2001-02-16 T 00:00:00
Comments