dmvianna - 8 months ago 34

R Question

I have a roster of employees, and I need to know at what department they are in most often. It is trivial to tabulate employee ID against department name, but it is trickier to return the department name, rather than the number of roster counts, from the frequency table. A simple example below (column names = departments, row names = employee ids).

`DF <- matrix(sample(1:9,9),ncol=3,nrow=3)`

DF <- as.data.frame.matrix(DF)

> DF

V1 V2 V3

1 2 7 9

2 8 3 6

3 1 5 4

Now how do I get

`> DF2`

RE

1 V3

2 V1

3 V2

Answer

One option using your data (for future reference, use `set.seed()`

to make examples using `sample`

reproducible):

```
DF <- data.frame(V1=c(2,8,1),V2=c(7,3,5),V3=c(9,6,4))
colnames(DF)[apply(DF,1,which.max)]
[1] "V3" "V1" "V2"
```

A faster solution than using `apply`

might be `max.col`

:

```
colnames(DF)[max.col(DF,ties.method="first")]
#[1] "V3" "V1" "V2"
```

...where `ties.method`

can be any of `"random"`

`"first"`

or `"last"`

This of course causes issues if you happen to have two columns which are equal to the maximum. I'm not sure what you want to do in that instance as you will have more than one result for some rows. E.g.:

```
DF <- data.frame(V1=c(2,8,1),V2=c(7,3,5),V3=c(7,6,4))
apply(DF,1,function(x) which(x==max(x)))
[[1]]
V2 V3
2 3
[[2]]
V1
1
[[3]]
V2
2
```