dmvianna - 1 year ago 77
R Question

# for each row, return the column name of the largest value

I have a roster of employees, and I need to know at what department they are in most often. It is trivial to tabulate employee ID against department name, but it is trickier to return the department name, rather than the number of roster counts, from the frequency table. A simple example below (column names = departments, row names = employee ids).

``````DF <- matrix(sample(1:9,9),ncol=3,nrow=3)
DF <- as.data.frame.matrix(DF)
> DF
V1 V2 V3
1  2  7  9
2  8  3  6
3  1  5  4
``````

Now how do I get

``````> DF2
RE
1 V3
2 V1
3 V2
``````

One option using your data (for future reference, use `set.seed()` to make examples using `sample` reproducible):

``````DF <- data.frame(V1=c(2,8,1),V2=c(7,3,5),V3=c(9,6,4))

colnames(DF)[apply(DF,1,which.max)]
[1] "V3" "V1" "V2"
``````

A faster solution than using `apply` might be `max.col`:

``````colnames(DF)[max.col(DF,ties.method="first")]
#[1] "V3" "V1" "V2"
``````

...where `ties.method` can be any of `"random"` `"first"` or `"last"`

This of course causes issues if you happen to have two columns which are equal to the maximum. I'm not sure what you want to do in that instance as you will have more than one result for some rows. E.g.:

``````DF <- data.frame(V1=c(2,8,1),V2=c(7,3,5),V3=c(7,6,4))
apply(DF,1,function(x) which(x==max(x)))

[[1]]
V2 V3
2  3

[[2]]
V1
1

[[3]]
V2
2
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download