Sambit Nandi - 3 months ago
R Question

R: Get column name based on row values in R

I have a table like the one below and would like to create suggestions based on the row values in R.

This is what I have -

id class1 class2 class3 class4
A 0.98 0.48 0.21 0.99
B 0.22 0.31 0.41 0.11
C 0.70 0.81 0.61 0.21


I would like to add two new columns ('sugg1', 'sugg2') that give the column names of the two largest values in each row. For example, in the first row 0.99 is the maximum value, so its corresponding column name is class4, and the next largest value is 0.98, for which the column name is class1.

id class1 class2 class3 class4 sugg1 sugg2
A 0.98 0.48 0.21 0.99 class4 class1
B 0.22 0.31 0.41 0.11 class3 class2
C 0.70 0.81 0.61 0.21 class2 class1

Answer

We can use apply with MARGIN = 1 to loop over the rows, sort each row's values in decreasing order (by sorting the negated values), take the names of the first two (head(..., 2)), transpose the output, and assign it to two new columns in the original dataset.

df1[paste0("sugg", 1:2)] <- t(apply(df1[-1], 1, FUN = function(x) names(head(sort(-x),2))))

df1
#  id class1 class2 class3 class4  sugg1  sugg2
#1  A   0.98   0.48   0.21   0.99 class4 class1
#2  B   0.22   0.31   0.41   0.11 class3 class2
#3  C   0.70   0.81   0.61   0.21 class2 class1
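
To see what the one-liner does, here is a small sketch that unpacks it for the first row and then wraps the same logic in a helper. The name top_n_names and its n argument are made up for illustration and are not part of the original answer; rows containing NA values are not handled.

```r
# Same sample data as in the "data" section
df1 <- data.frame(
  id     = c("A", "B", "C"),
  class1 = c(0.98, 0.22, 0.70),
  class2 = c(0.48, 0.31, 0.81),
  class3 = c(0.21, 0.41, 0.61),
  class4 = c(0.99, 0.11, 0.21),
  stringsAsFactors = FALSE
)

# First row (without 'id') as a named numeric vector
x <- unlist(df1[1, -1])

# Sorting the negated values ascending puts the largest original value first
sorted <- sort(-x)

# The names of the first two elements are the wanted column names
first_row <- names(head(sorted, 2))
first_row
# [1] "class4" "class1"

# Hypothetical helper: names of the n largest values of a named vector
top_n_names <- function(x, n = 2) {
  names(head(sort(x, decreasing = TRUE), n))
}

# Applied to every row; apply() returns one column per row, hence t()
sugg <- t(apply(df1[-1], 1, top_n_names, n = 2))
```

Here sugg is a 3 x 2 character matrix, so it can be assigned to df1[paste0("sugg", 1:2)] in the same way as above.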

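This can also be done by melting the original df1 (without the 'sugg' columns) into 'long' format, ordering by 'value' in decreasing order, taking the first two 'variable' entries for each 'id', casting back to 'wide' format, and then joining the result onto the original dataset.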

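library(data.table) # rowid() requires v1.9.7+
setDT(df1) # df1 is a data.frame in the 'data' section; convert it to a data.table by reference
df1[dcast(melt(df1, id.vars = "id")[order(-value), head(variable, 2),
       id], id ~ paste0("sugg", rowid(id)), value.var = "V1"), on = "id"]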
#   id class1 class2 class3 class4  sugg1  sugg2
#1:  A   0.98   0.48   0.21   0.99 class4 class1
#2:  B   0.22   0.31   0.41   0.11 class3 class2
#3:  C   0.70   0.81   0.61   0.21 class2 class1
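
The chained call is dense, so here is the same idea split into named steps. This is only a sketch: the intermediate names (dt, long, top2, wide, result) and the explicit 'variable' and 'sugg' columns are chosen for illustration, and the data is rebuilt directly as a data.table.

```r
library(data.table)

# Same sample data as in the "data" section, built directly as a data.table
dt <- data.table(
  id     = c("A", "B", "C"),
  class1 = c(0.98, 0.22, 0.70),
  class2 = c(0.48, 0.31, 0.81),
  class3 = c(0.21, 0.41, 0.61),
  class4 = c(0.99, 0.11, 0.21)
)

# 1. Long format: one row per (id, variable, value)
long <- melt(dt, id.vars = "id")

# 2. Order by descending value, then keep the first two variable names per id
top2 <- long[order(-value), .(variable = head(variable, 2)), by = id]

# 3. Number the picks within each id (sugg1, sugg2) and cast back to wide
top2[, sugg := paste0("sugg", rowid(id))]
wide <- dcast(top2, id ~ sugg, value.var = "variable")

# 4. Join the suggestions onto the original table by id
result <- dt[wide, on = "id"]
```

Note that sugg1 and sugg2 come out as factors here, because melt stores the former column names in a factor column; wrap them in as.character() if plain strings are needed.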

data

df1 <- structure(list(id = c("A", "B", "C"), class1 = c(0.98, 0.22, 
0.7), class2 = c(0.48, 0.31, 0.81), class3 = c(0.21, 0.41, 0.61
), class4 = c(0.99, 0.11, 0.21)), .Names = c("id", "class1", 
"class2", "class3", "class4"), class = "data.frame",
row.names = c(NA, -3L))