user1017373 user1017373 - 1 month ago 7
Python Question

How to append a suffix for specific column names of a dataframe from a list

I want to add a prefix (or suffix) to certain column names of df1, chosen by the names listed in df2.
My df1 looks like this:

AE02 AE03 AE04 AE05 AE06 AE07 AE08 AE09 AE10 AE11 AE12
11.9619362364 18.5701402709 42.2010838789 28.0025053738 19.5589170223 18.1459582989 16.5292369479 32.4885640738 34.0342144643 31.6971000153 44.932255488
2.9904840591 3.9793157723 0 0 1.7780833657 1.7281865047 13.7743641233 4.3318085432 0 17.067669239 0
0 0 0 0 2.6671250485 0 4.5914547078 0 0 0 2.1396312137


and df2 looks like,

V1
AE06
AE08
AE09
AE12


I can replace those column names with a new name as follows:

colnames(df1)[which(colnames(df1) %in% df2$V1 )] <- "DMR"


But I am looking for a solution that adds a prefix to the matching column names instead of replacing them.
For instance, my column names should look like this:

AE02 AE03 AE04 AE05 DMR_AE06 AE07 DMR_AE08 DMR_AE09 AE10 AE11 DMR_AE12


Any suggestions and help are much appreciated.

Answer

Pandas solution:

You can use numpy.where with a boolean mask built by Index.isin (df below is the question's df1):

import numpy as np

# Boolean mask: True where a column name appears in df2.V1
print(df.columns.isin(df2.V1))
[False False False False  True False  True  True False False  True]

# Prefix only the masked names; leave the others unchanged
df.columns = np.where(df.columns.isin(df2.V1), 'DMR_' + df.columns, df.columns)
print(df)
        AE02       AE03       AE04       AE05   DMR_AE06       AE07  \
0  11.961936  18.570140  42.201084  28.002505  19.558917  18.145958   
1   2.990484   3.979316   0.000000   0.000000   1.778083   1.728187   
2   0.000000   0.000000   0.000000   0.000000   2.667125   0.000000   

    DMR_AE08   DMR_AE09       AE10       AE11   DMR_AE12  
0  16.529237  32.488564  34.034214  31.697100  44.932255  
1  13.774364   4.331809   0.000000  17.067669   0.000000  
2   4.591455   0.000000   0.000000   0.000000   2.139631  
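
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same numpy.where approach. The frames are rebuilt from the question's sample data (values rounded for brevity), and prefix_selected_columns is a hypothetical helper name introduced only for this illustration; it is not part of pandas.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (values rounded for brevity).
columns = ["AE02", "AE03", "AE04", "AE05", "AE06", "AE07",
           "AE08", "AE09", "AE10", "AE11", "AE12"]
df1 = pd.DataFrame(
    [
        [11.96, 18.57, 42.20, 28.00, 19.56, 18.15, 16.53, 32.49, 34.03, 31.70, 44.93],
        [2.99, 3.98, 0.0, 0.0, 1.78, 1.73, 13.77, 4.33, 0.0, 17.07, 0.0],
        [0.0, 0.0, 0.0, 0.0, 2.67, 0.0, 4.59, 0.0, 0.0, 0.0, 2.14],
    ],
    columns=columns,
)
df2 = pd.DataFrame({"V1": ["AE06", "AE08", "AE09", "AE12"]})


def prefix_selected_columns(frame, names, prefix="DMR_"):
    """Hypothetical helper: return a copy of `frame` in which every column
    whose name appears in `names` gets `prefix` prepended."""
    # True for each column name that is listed in `names`.
    mask = frame.columns.isin(names)
    renamed = frame.copy()
    # Pick the prefixed name where the mask is True, the original otherwise.
    renamed.columns = np.where(mask, prefix + frame.columns, frame.columns)
    return renamed


result = prefix_selected_columns(df1, df2["V1"])
print(list(result.columns))
```

The helper works on a copy, so df1 keeps its original names. If you would rather avoid numpy, DataFrame.rename also accepts a function for columns, e.g. df1.rename(columns=lambda c: 'DMR_' + c if c in set(df2['V1']) else c), which should give the same column names.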