Haboryme Haboryme - 2 months ago 8
R Question

Concatenate values of 2 columns into 1 (equivalent of R's paste)

Example data in Python 3.5:

import pandas as pd
df = pd.DataFrame({"A": ["x", "y", "z", "t", "f"],
                   "B": [1, 2, 1, 2, 4]})


This gives me a DataFrame with two columns, "A" and "B".
I then want to add a third column, "C", that contains the values of "A" and "B" concatenated and separated by "_".

Following the suggestion from this answer I can do it like this.

for i in range(len(df["A"])):
    df.loc[i, "C"] = df.loc[i, "A"] + "_" + str(df.loc[i, "B"])


I get the result I want but it seems convoluted for such a simple task.

In R this would be done like this:

df <- data.frame(A = c("x", "y", "z", "t", "f"),
                 B = c(1, 2, 1, 2, 4))
df$C <- paste(df$A, df$B, sep = "_")


Another thread suggested the use of the "%" operator but I can't get it to work.

Is there a better alternative?

Answer

You can simply add the columns together, but 'B' holds integers, so you first need to cast it to strings using astype(str):

In [115]:
df['C'] = df['A'] + '_' + df['B'].astype(str)
df

Out[115]:
   A  B    C
0  x  1  x_1
1  y  2  y_2
2  z  1  z_1
3  t  2  t_2
4  f  4  f_4

This is a vectorised approach and will scale much better than looping over every row for large DataFrames.
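
For comparison, here is a minimal sketch of two other ways to get the same column, assuming the same example data as in the question. The first uses pandas' Series.str.cat, whose sep argument is probably the closest match to R's paste(..., sep="_"). The second shows how the "%" operator mentioned in the question could be made to work. The column name "D" is only introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": ["x", "y", "z", "t", "f"],
                   "B": [1, 2, 1, 2, 4]})

# Series.str.cat joins two string Series element-wise with a separator.
# "B" still has to be cast to str first, as in the answer above.
df["C"] = df["A"].str.cat(df["B"].astype(str), sep="_")

# The "%" operator formats one Python string at a time, so it needs an
# explicit pass over the rows (here a list comprehension over zipped columns).
# "D" is a hypothetical extra column used only to compare the two results.
df["D"] = ["%s_%s" % (a, b) for a, b in zip(df["A"], df["B"])]

print(df)
```

Both columns should hold the same strings. The str.cat version stays inside pandas' string methods, while the "%" version runs in plain Python over each row and mostly makes sense when the format string is more involved than a simple separator.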
