Palash Siddamsettiwar - 2 months ago
R Question

R: Iteratively combine consecutive elements of a character vector until an empty string element is reached

I have a character vector made up of long strings (alphanumeric plus special characters), like the one shown below.

txt <- c(
"Spicy jalapeno bacon ipsum dolor amet",
"tenderloin. pariatur quis",
"",
"consequat pancetta jerky",
"porchetta non chuck exercitation",
"laborum labore ball tip.",
"",
"",
"Duis swine turkey kielbasa. Strip ",
"steak ribeye laboris,"
)


The required output is:

> txt
[1] "Spicy jalapeno bacon ipsum dolor amet tenderloin. pariatur quis"
[2] "consequat pancetta jerky porchetta non chuck exercitation laborum labore ball tip."
[3] "Duis swine turkey kielbasa. Strip steak ribeye laboris,"


Things to consider:

1. Empty-string elements act as line breaks, and several of them may occur consecutively.

2. When two elements are joined, a single space must be added between them.

Answer

One of a plethora of ways to do this:

library(dplyr)

# Every empty string bumps the running count, so each run of lines gets its own group id
tibble(txt = txt, grp = cumsum(txt == "")) %>% 
  mutate(txt = trimws(txt)) %>%      # trim each piece so joins never produce a double space
  filter(txt != "") %>%              # drop the separator elements themselves
  group_by(grp) %>% 
  summarise(joined = paste(txt, collapse = " ")) %>% 
  pull(joined)                       # return a plain character vector
## [1] "Spicy jalapeno bacon ipsum dolor amet tenderloin. pariatur quis"
## [2] "consequat pancetta jerky porchetta non chuck exercitation laborum labore ball tip."
## [3] "Duis swine turkey kielbasa. Strip steak ribeye laboris,"
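
For comparison, here is a minimal base-R sketch of the same grouping idea, with no packages required. It is not part of the original answer. The names `grp`, `keep` and `joined` are made up for illustration, and it assumes that only exact empty strings (`""`) count as separators.

```r
# Sample input from the question; empty strings mark the breaks
txt <- c(
  "Spicy jalapeno bacon ipsum dolor amet",
  "tenderloin. pariatur quis",
  "",
  "consequat pancetta jerky",
  "porchetta non chuck exercitation",
  "laborum labore ball tip.",
  "",
  "",
  "Duis swine turkey kielbasa. Strip ",
  "steak ribeye laboris,"
)

grp  <- cumsum(txt == "")   # group id increases at every empty string
keep <- txt != ""           # logical mask that drops the separators

# Trim each piece, split the pieces by group, then paste each group
# back together with a single space
joined <- vapply(
  split(trimws(txt[keep]), grp[keep]),
  paste, character(1), collapse = " ",
  USE.NAMES = FALSE
)
joined
## [1] "Spicy jalapeno bacon ipsum dolor amet tenderloin. pariatur quis"
## [2] "consequat pancetta jerky porchetta non chuck exercitation laborum labore ball tip."
## [3] "Duis swine turkey kielbasa. Strip steak ribeye laboris,"
```

Because the separators are filtered out before splitting, runs of several consecutive empty strings simply leave gaps in the group ids and never produce empty output elements.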