Daniel - 3 months ago
R Question

Create an ID column in a data frame from two columns

I have a data frame with 8 variables, and I need a new column that combines two of the columns to create an ID for each observation. The two columns I need to combine look like this:

Aut<-c("Robert Lucas", "Finn Kydland & Edward Prescott", "Alan Blinder & Ben Bernanke",
"Lars Svensson & Lawrence Christiano & Robert Lucas", "Ben Bernanke")
Year<-c(1976, 1989, 1983, 1985, 1983)
df<-data.frame(Aut, Year)

The ID I expect is:

Aut                                                Year ID
Robert Lucas                                       1976 RoLu1976
Finn Kydland & Edward Prescott                     1989 FiKyEdPr1989
Alan Blinder & Ben Bernanke                        1983 AlBlBeBe1983
Lars Svensson & Lawrence Christiano & Robert Lucas 1985 LaSvLaChRoLu1985
Ben Bernanke                                       1983 BeBe1983


You can try the following, which uses the stringr package:

library(stringr)

# First split the individual names using "&" as the pattern.
a <- str_split(df$Aut, "&")
# Then use lapply, str_split and str_sub to split first and last names, and
# paste the first two letters of each name together.
a1 <- lapply(a, function(x){
  x1 <- str_split(str_trim(x), " ")
  paste0(unlist(lapply(x1, str_sub, 1, 2)), collapse="")
})
# Finally add the years. The resulting vector can be saved in df.
df$ID <- paste0(unlist(a1), df$Year)
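To see what each step produces, here is a small trace for a single entry. It is only a sketch: the variable names (aut, authors, words, prefix, id) are made up for illustration, and it assumes stringr is installed.

```r
library(stringr)

# One author string and its year, taken from the question's data
aut <- "Finn Kydland & Edward Prescott"
year <- 1989

# Step 1: split on "&"; str_split returns a list, so take its first element
authors <- str_split(aut, "&")[[1]]

# Step 2: trim the surrounding spaces, then split each author into words
words <- str_split(str_trim(authors), " ")

# Step 3: keep the first two letters of every word and glue them together
prefix <- paste0(unlist(lapply(words, str_sub, 1, 2)), collapse = "")

# Step 4: append the year
id <- paste0(prefix, year)
print(id)
```

This prints "FiKyEdPr1989", which matches the expected ID for that row.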

And everything together in one function:

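# Requires library(stringr)
foo <- function(a, b){
   # Split each entry into individual authors
   a <- str_split(a, "&")
   a1 <- lapply(a, function(x){
           # Split each author into words and keep the first two letters of each
           x1 <- str_split(str_trim(x), " ")
           paste0(unlist(lapply(x1, str_sub, 1, 2)), collapse="")
   })
   # Append the year to each prefix
   paste0(unlist(a1), b)
}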

foo(df$Aut, df$Year)
[1] "RoLu1976"         "FiKyEdPr1989"     "AlBlBeBe1983"     "LaSvLaChRoLu1985" "BeBe1983"
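If you would rather not depend on stringr, the same idea can be sketched in base R. This is an alternative, not the approach above: make_id is a hypothetical name, and the as.character() call is there on the assumption that Aut might be a factor (the data.frame() default before R 4.0), since strsplit() does not accept factors.

```r
# Hypothetical base R helper: builds the ID from an author string and a year
make_id <- function(aut, year) {
  # Split each entry on the literal "&"
  parts <- strsplit(as.character(aut), "&", fixed = TRUE)
  prefix <- vapply(parts, function(x) {
    # Trim each author, split into words on runs of spaces
    words <- unlist(strsplit(trimws(x), " +"))
    # First two letters of every word, pasted together
    paste0(substr(words, 1, 2), collapse = "")
  }, character(1))
  paste0(prefix, year)
}

Aut <- c("Robert Lucas", "Finn Kydland & Edward Prescott",
         "Alan Blinder & Ben Bernanke",
         "Lars Svensson & Lawrence Christiano & Robert Lucas", "Ben Bernanke")
Year <- c(1976, 1989, 1983, 1985, 1983)
df <- data.frame(Aut, Year)

df$ID <- make_id(df$Aut, df$Year)
print(df$ID)
```

Using vapply with character(1) instead of lapply plus unlist makes the function fail early if any entry does not reduce to exactly one string.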