Daniel - 20 days ago 4x
R Question

# Combine the first two letters of each word in a sentence string and a numeric variable

I have a data frame with 8 variables and I need to create a new column that represents a combination of two columns for use as an ID for each observation. The two columns that I need to combine look like this:

Aut <- c("Robert Lucas", "Finn Kydland & Edward Prescott", "Alan Blinder & Ben Bernanke",
         "Lars Svensson & Lawrence Christiano & Robert Lucas", "Ben Bernanke")
Year <- c(1976, 1989, 1983, 1985, 1983)
df <- data.frame(Aut, Year)

The resulting ID variable I expect is:

Aut                                                 Year  ID
Robert Lucas                                        1976  RoLu1976
Finn Kydland & Edward Prescott                      1989  FiKyEdPr1989
Alan Blinder & Ben Bernanke                         1983  AlBlBeBe1983
Lars Svensson & Lawrence Christiano & Robert Lucas  1985  LaSvLaChRoLu1985
Ben Bernanke                                        1983  BeBe1983

You can try:

library(stringr)
# First split each entry into individual names, using "&" as the pattern.
a <- str_split(df$Aut, "&")
# Then trim the whitespace and use lapply, str_split and str_sub to separate
# first and last names and paste the first two letters of each word together.
a1 <- lapply(a, function(x) {
  x1 <- str_split(str_trim(x), " ")
  paste0(unlist(lapply(x1, str_sub, 1, 2)), collapse = "")
})
# Finally, append the years. The resulting vector can be saved in df.
df$ID <- paste0(unlist(a1), df$Year)

And everything together in one function:

foo <- function(a, b) {
  a <- str_split(a, "&")
  a1 <- lapply(a, function(x) {
    x1 <- str_split(str_trim(x), " ")
    paste0(unlist(lapply(x1, str_sub, 1, 2)), collapse = "")
  })
  paste0(unlist(a1), b)
}

foo(df$Aut, df$Year)
[1] "RoLu1976"         "FiKyEdPr1989"     "AlBlBeBe1983"     "LaSvLaChRoLu1985" "BeBe1983"
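
If you would rather not depend on stringr, the same idea can probably be sketched in base R. This is only an illustration under a few assumptions: the helper name `make_id` is made up here, words are assumed to be separated by spaces and/or "&", and `as.character()` is added in case `Aut` was stored as a factor (the default for `data.frame()` before R 4.0).

```r
# Hypothetical helper (not part of the answer above): base R only.
make_id <- function(aut, year) {
  # Split every entry into words; runs of whitespace and "&" act as separators.
  words <- strsplit(as.character(aut), "[[:space:]&]+")
  # Keep the first two characters of each non-empty word and glue them together.
  prefix <- vapply(words, function(w) {
    w <- w[nzchar(w)]  # drop empty strings left by leading separators
    paste0(substr(w, 1, 2), collapse = "")
  }, character(1))
  # Append the year to build the ID.
  paste0(prefix, year)
}

Aut <- c("Robert Lucas", "Finn Kydland & Edward Prescott", "Alan Blinder & Ben Bernanke",
         "Lars Svensson & Lawrence Christiano & Robert Lucas", "Ben Bernanke")
Year <- c(1976, 1989, 1983, 1985, 1983)
df <- data.frame(Aut, Year)

df$ID <- make_id(df$Aut, df$Year)
df$ID
# [1] "RoLu1976"         "FiKyEdPr1989"     "AlBlBeBe1983"     "LaSvLaChRoLu1985" "BeBe1983"
```

Splitting on whitespace and "&" in one pass avoids the separate trim step, at the cost of treating any stray "&" inside a name as a separator.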