sarasreddy74 - 3 months ago
R Question

R Split a column based on pattern

I have a long list of character vectors that I got from str_extract_all().

The output of head(list) is as follows:

[[1]]
[1] "ARGENTINA"

[[2]]
[1] "BUENOS " "AIRES" "BUENOS " "AIRES" "ARGENTINA"

[[3]]
[1] "ARGENTINA" "ARGENTINA"

[[4]]
[1] "ARGENTINA" "ARGENTINA"

[[5]]
[1] "ARGENTINA"

[[6]]
[1] "ARGENTINA"


I now want to export the output to Excel, with each element of a vector occupying its own column within the same row. For example:

p1        p2        p3        p4    p5
ARGENTINA NA        NA        NA    NA
BUENOS    AIRES     BUENOS    AIRES ARGENTINA
ARGENTINA ARGENTINA ARGENTINA NA    NA


But I get the following error while trying to do it:


Error in data.frame("ARGENTINA", c("BUENOS ", "AIRES", "BUENOS ",
"AIRES", : arguments imply differing number of rows: 1, 5, 2, 3,
6, 4, 0, 9, 8, 7, 38,


Any help will be appreciated.

Answer

As the list elements have different lengths, we need to pad each one with NA at the end before rbinding them. We can do this by using length<- to set every element to the maximum length found in the list.

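# Strip the trailing spaces left by the extraction (e.g. "BUENOS ")
lst <- lapply(lst, trimws)
# Pad every vector with NA up to the longest length, then bind them as rows
n <- max(lengths(lst))
d1 <- as.data.frame(do.call(rbind, lapply(lst, `length<-`, n)),
                    stringsAsFactors = FALSE)
names(d1) <- paste0("p", seq_along(d1))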
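To see why the padding works, here is a minimal sketch of what length<- does to a single vector. The variable names x, y and padded are only for illustration and are not part of the original answer.

```r
# A vector shorter than the target length
x <- c("ARGENTINA", "ARGENTINA")

# Functional form, as used inside lapply(): returns a padded copy of x
padded <- `length<-`(x, 5)

# Replacement form: changes the length of y in place
y <- x
length(y) <- 5

# Both give c("ARGENTINA", "ARGENTINA", NA, NA, NA)
identical(padded, y)
```

Extending the length of a vector fills the new positions with NA, which is what lets rbind treat all elements as rows of equal width.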

If we are using packages, stri_list2matrix from stringi is a convenient function for this. It returns a character matrix with the shorter rows padded with NA:

library(stringi)
stri_list2matrix(lst, byrow=TRUE)
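For the export step mentioned in the question, one possible route is to write the padded data frame to a CSV file, which Excel can open. This is only a sketch under assumptions: the file name countries.csv and the variable out_file are hypothetical, only the first three list elements are used to keep it short, and CSV is chosen because write.csv ships with base R.

```r
# Sample data from the question (first three elements only)
lst <- list("ARGENTINA",
            c("BUENOS ", "AIRES", "BUENOS ", "AIRES", "ARGENTINA"),
            c("ARGENTINA", "ARGENTINA"))
lst <- lapply(lst, trimws)

# Pad to a common length and build the data frame, as in the answer
n <- max(lengths(lst))
d1 <- as.data.frame(do.call(rbind, lapply(lst, `length<-`, n)),
                    stringsAsFactors = FALSE)
names(d1) <- paste0("p", seq_along(d1))

# Hypothetical output path; replace with a real location
out_file <- file.path(tempdir(), "countries.csv")

# na = "" leaves padded cells empty instead of writing the text "NA"
write.csv(d1, out_file, row.names = FALSE, na = "")
```

Setting row.names = FALSE keeps R's row numbers out of the first spreadsheet column.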

data

lst <- list("ARGENTINA",
            c("BUENOS ", "AIRES", "BUENOS ", "AIRES", "ARGENTINA"),
            c("ARGENTINA", "ARGENTINA"),
            c("ARGENTINA", "ARGENTINA"),
            "ARGENTINA",
            "ARGENTINA")