Daniel - 1 year ago 59
R Question

# Create an ID column in a data frame from two columns

I have a data frame with 8 variables, and I need a new column that combines two existing columns to create an ID for each observation. The two columns I need to combine look like this:

``````
Aut <- c("Robert Lucas", "Finn Kydland & Edward Prescott", "Alan Blinder & Ben Bernanke",
         "Lars Svensson & Lawrence Christiano & Robert Lucas", "Ben Bernanke")
Year <- c(1976, 1989, 1983, 1985, 1983)
df <- data.frame(Aut, Year)
``````

The ID I expect is:

``````
Aut                                                 Year  ID
Robert Lucas                                        1976  RoLu1976
Finn Kydland & Edward Prescott                      1989  FiKyEdPr1989
Alan Blinder & Ben Bernanke                         1983  AlBlBeBe1983
Lars Svensson & Lawrence Christiano & Robert Lucas  1985  LaSvLaChRoLu1985
Ben Bernanke                                        1983  BeBe1983
``````

You can try:

``````
library(stringr)
# First, split each entry into individual names, using "&" as the pattern.
a <- str_split(df$Aut, "&")
# Then, for each entry, trim the whitespace, split every name into first and
# last name on the space, and paste the first two letters of each part together.
a1 <- lapply(a, function(x) {
  x1 <- str_split(str_trim(x), " ")
  paste0(unlist(lapply(x1, str_sub, 1, 2)), collapse = "")
})
# Finally, append the year. The resulting vector can be saved in df.
df$ID <- paste0(unlist(a1), df$Year)
``````

And everything together in one function:

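``````
foo <- function(a, b) {
  # Split each entry into individual names on "&".
  a <- str_split(a, "&")
  a1 <- lapply(a, function(x) {
    # Split each trimmed name on the space and keep the first two letters of each part.
    x1 <- str_split(str_trim(x), " ")
    paste0(unlist(lapply(x1, str_sub, 1, 2)), collapse = "")
  })
  # Append the year to each abbreviation.
  paste0(unlist(a1), b)
}

foo(df$Aut, df$Year)
# [1] "RoLu1976"         "FiKyEdPr1989"     "AlBlBeBe1983"     "LaSvLaChRoLu1985" "BeBe1983"
``````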
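If you would rather not depend on stringr, the same steps can probably be done in base R. This is only a minimal sketch: `make_id` is a made-up name, and it assumes that authors are always separated by "&" and that the parts of each name are separated by whitespace.

```r
# Base R sketch of the same approach; make_id is a hypothetical helper name.
make_id <- function(authors, year) {
  # as.character() guards against the column being stored as a factor.
  parts <- strsplit(as.character(authors), "&", fixed = TRUE)
  prefix <- vapply(parts, function(p) {
    # Trim each author, split into words on whitespace, drop empty strings.
    words <- unlist(strsplit(trimws(p), "[[:space:]]+"))
    words <- words[nzchar(words)]
    # Keep the first two letters of every word and glue them together.
    paste0(substr(words, 1, 2), collapse = "")
  }, character(1))
  # Append the year to each abbreviation.
  paste0(prefix, year)
}

Aut <- c("Robert Lucas", "Finn Kydland & Edward Prescott", "Alan Blinder & Ben Bernanke",
         "Lars Svensson & Lawrence Christiano & Robert Lucas", "Ben Bernanke")
Year <- c(1976, 1989, 1983, 1985, 1983)
df <- data.frame(Aut, Year)

df$ID <- make_id(df$Aut, df$Year)

# anyDuplicated() returns 0 when no ID occurs more than once.
anyDuplicated(df$ID)
```

Since the column is meant to identify each observation, the `anyDuplicated()` check at the end is worth keeping: two different author sets that share the same first two letters and the same year would collide under this scheme.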