boshek - 25 days ago
R Question

Combining gsub calls and removing characters after the last instance of a string

I have the following string:

time <- "2017-05-30T09:20:00-08:00"


I want to use gsub to produce this:

"2017-05-30 09:20:00"


Here is what I have so far:

time2 <- gsub("T", " ", time)
gsub("\\-.*", "", time2)


Two questions:


  1. How do I remove all characters after the last instance of - ?

  2. How do I combine these two statements into one?


Answer

Use a single call to sub with a spelled-out regex that captures the parts you are interested in and simply matches everything else. Then use the backreferences \1 and \2 in the replacement pattern to re-insert the two captured parts:

^(\d{4}-\d{2}-\d{2})T(\d{2}:\d{2}:\d{2}).*

See the regex demo.

Details:

  • ^ - start of a string
  • (\d{4}-\d{2}-\d{2}) - Group 1: 4 digits, -, 2 digits, - and then 2 digits
  • T - a T letter
  • (\d{2}:\d{2}:\d{2}) - Group 2: 2 digits, :, 2 digits, : and 2 digits
  • .* - any 0+ chars up to the string end.
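
To see what each group from the list above actually captures, one option is to pull the matches out with base R's regexec() and regmatches(). This is only an illustrative sketch; the variable names pattern and parts are made up for the example.

```r
time_s <- "2017-05-30T09:20:00-08:00"
pattern <- "^(\\d{4}-\\d{2}-\\d{2})T(\\d{2}:\\d{2}:\\d{2}).*"

# regexec() locates the full match and every capture group;
# regmatches() turns those positions into substrings.
parts <- regmatches(time_s, regexec(pattern, time_s))[[1]]

parts[1]  # full match (the whole string, because of the trailing .*)
parts[2]  # Group 1: the date
parts[3]  # Group 2: the time
```

Since the trailing .* consumes the rest of the string, replacing the full match with "\\1 \\2" leaves only the date and the time.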

R online demo:

time_s <- "2017-05-30T09:20:00-08:00"
sub("^(\\d{4}-\\d{2}-\\d{2})T(\\d{2}:\\d{2}:\\d{2}).*", "\\1 \\2", time_s)
## => [1] "2017-05-30 09:20:00"
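
The two questions can also be answered more literally. The sketch below is an alternative to the capture-group approach, not a replacement for it: it strips everything from the last - onward and then nests the two calls into one expression. It assumes the UTC offset is always negative; with an offset such as +08:00 the last - sits inside the date, so the date would be cut instead. The names no_offset and result are only for illustration.

```r
time_s <- "2017-05-30T09:20:00-08:00"

# Question 1: "-[^-]*$" matches a hyphen followed only by non-hyphen
# characters up to the end of the string, i.e. the last hyphen and its tail.
no_offset <- sub("-[^-]*$", "", time_s)
no_offset
## => [1] "2017-05-30T09:20:00"

# Question 2: nest the calls so the inner result feeds the outer one.
# fixed = TRUE treats "T" as a literal string rather than a regex.
result <- sub("T", " ", sub("-[^-]*$", "", time_s), fixed = TRUE)
result
## => [1] "2017-05-30 09:20:00"
```

The original attempt with "\\-.*" returns "2017" because the match starts at the first - in the string; anchoring the pattern at the end with $ and excluding further hyphens is what forces the match onto the last one.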