Therru Therru - 1 year ago 41
R Question

How to Convert a Line into Multiple Columns in R

I have many text files and all these files include a line which starts with the same comment.

"HDR TIME_YMD=2001-02-16 T 00:00:00"
"HDR TIME_YMD=2001-03-18 T 00:00:00"


etc.

I want to be able to change the line into two columns and create a data frame as below.

Year Month
2001 02 (or February)
2001 03 (or March)


I have found similar answers from other questions. With the benefit of extract data between a pattern from a text file in R the code that I have imitated is as follows, but I couldn't code the "Month" column.

DT <- DT[, `:=` (Year = as.numeric(gsub('^.*(\\d+{4}).*','\\1',
grep('HDR TIME_YMD=', txt, value = TRUE))), Month=????


Another answers I am looking for;

I want to convert the data into three data columns as
'Longitude','Latitude','TWC'


Year Month Longitude Latitude TWC
1 2001 02 130.5 -16.5 6.935
2 2001 02 131.5 -16.5 13.912
3 2001 03 132.5 -16.5 13.244
4 2001 03 133.5 -16.5 15.556
5 2001 03 134.5 -16.5 21.380


I am very noob to R and I may need a detailed explanation. Thanks!

Answer Source

If the time stamp is formatted as well as you have shown, you can just split it and use index to get the year and month:

library(data.table)
dt[, c("Year", "Month") := tstrsplit(TimeStamp, "=|-")[2:3]]
dt
#                            TimeStamp Year Month
#1: HDR TIME_YMD=2001-02-16 T 00:00:00 2001    02
#2: HDR TIME_YMD=2001-02-16 T 00:00:00 2001    02
#3: HDR TIME_YMD=2001-02-16 T 00:00:00 2001    02

Where dt is:

dt = data.table(TimeStamp = c("HDR TIME_YMD=2001-02-16 T 00:00:00", 
                              "HDR TIME_YMD=2001-02-16 T 00:00:00", 
                              "HDR TIME_YMD=2001-02-16 T 00:00:00"))
dt
#                             TimeStamp
# 1: HDR TIME_YMD=2001-02-16 T 00:00:00
# 2: HDR TIME_YMD=2001-02-16 T 00:00:00
# 3: HDR TIME_YMD=2001-02-16 T 00:00:00