Jack Jack - 9 months ago 26
R Question

How to get row names instead of row numbers when selecting the top values from each column independently?

I asked this question a couple of days ago, and @Juan Bosco suggested the code below, which works perfectly and selects the top n values from each column. It turns out, though, that I need the name of each selected row for each column, so in the list Selectedrows I need something like "t9", "t8", "t7" instead of the row numbers.

names<- c("t1","t10","t11","t2","t3","t4","t5","t6","t7","t8","t9")
values1 <- c(2,3.1,4.5,5.1,6.5,7.1,8.5,9.11,10.1,11.8,12.3)
values2 <- c(1,3.1,3,5.1,6.5,7.1,8.5,9.11,10.1,12,12)
mydf<- data.frame(names,values1,values2)


Selectedrows <- lapply(2:3, function(col_index) {
  max_values <- sort(mydf[[col_index]], decreasing = TRUE)[1:3]
  max_rows <- sapply(max_values, function(one_value) {
    as.numeric(rownames(mydf[mydf[[col_index]] == one_value, ]))
  })

  unique(unlist(max_rows))[1:3]
})


Thanks

Answer

In this case you can use order, which returns the indices that would sort a vector. With decreasing = TRUE, the first three elements of the result are the row indices of the top three values. Use those indices to subset the names column and get the corresponding names:

lapply(2:3, function(col_index) {
    mydf[["names"]][order(mydf[[col_index]], decreasing = TRUE)[1:3]]
})

#[[1]]
#[1] "t9" "t8" "t7"

#[[2]]
#[1] "t8" "t9" "t7"
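
To see why this works, it may help to look at what order returns for a single column. Below is a minimal sketch using the data from the question. The variable names idx and top_names are made up for illustration, and stringsAsFactors = FALSE is passed explicitly (an assumption on my part) so that names is character regardless of R version:

```r
names   <- c("t1","t10","t11","t2","t3","t4","t5","t6","t7","t8","t9")
values1 <- c(2,3.1,4.5,5.1,6.5,7.1,8.5,9.11,10.1,11.8,12.3)
values2 <- c(1,3.1,3,5.1,6.5,7.1,8.5,9.11,10.1,12,12)
mydf <- data.frame(names, values1, values2, stringsAsFactors = FALSE)

# order() returns positions, not values:
# the row indices of values1, largest value first
idx <- order(mydf$values1, decreasing = TRUE)
idx[1:3]
# [1] 11 10  9

# Use those positions to subset the names column
mydf$names[idx[1:3]]
# [1] "t9" "t8" "t7"

# Same idea over both value columns; iterating over the columns by name
# keeps "values1" and "values2" as the names of the resulting list
top_names <- lapply(mydf[c("values1", "values2")], function(column) {
  mydf$names[order(column, decreasing = TRUE)[1:3]]
})
```

Here top_names$values1 and top_names$values2 hold the same vectors as the unnamed list above, which can be easier to read than [[1]] and [[2]] when there are many columns.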

Note that this assumes your names column is character. If it is a factor (the default for data.frame() before R 4.0.0), run mydf$names <- as.character(mydf$names) first so the result is character rather than factor.
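
A small sketch of that conversion, assuming the names column ended up as a factor. The data frame mydf_f and the variable top3 are hypothetical names used only here, and stringsAsFactors = TRUE is set on purpose to reproduce the factor case:

```r
# Rebuild the question's data with names stored as a factor
mydf_f <- data.frame(
  names   = c("t1","t10","t11","t2","t3","t4","t5","t6","t7","t8","t9"),
  values1 = c(2,3.1,4.5,5.1,6.5,7.1,8.5,9.11,10.1,11.8,12.3),
  stringsAsFactors = TRUE
)
is.factor(mydf_f$names)
# [1] TRUE

# Convert once, up front, so later subsetting returns plain strings
mydf_f$names <- as.character(mydf_f$names)

top3 <- mydf_f$names[order(mydf_f$values1, decreasing = TRUE)[1:3]]
top3
# [1] "t9" "t8" "t7"
```

Without the conversion, the subset would still select the right rows, but it would print as a factor with all eleven levels attached.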