runningbirds - 8 months ago 25

R Question

Surely there has to be a function out there in some package for this?

I've searched and I've found this function to calculate the mode:

```
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}
```

But I'd like a function that lets me easily calculate the 2nd/3rd/4th/nth most common value in a column of data.

Ultimately I will apply this function to a large number of groups created with `dplyr::group_by()`.

Thank you for your help!

Answer

Maybe you could try

```
f <- function (x) with(rle(sort(x)), values[order(lengths, decreasing = TRUE)])
```

This returns the unique values of `x` sorted by decreasing frequency. The first element is the mode, the second is the second most common value, and so on.
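Since `f()` returns every value ranked by frequency, picking the nth most common one is just an indexing step. Here is a minimal sketch; `nth_mode` is a made-up helper name, not a function from any package, and the data is invented for illustration:

```r
# f() as defined above: unique values sorted by decreasing frequency
f <- function (x) with(rle(sort(x)), values[order(lengths, decreasing = TRUE)])

# nth_mode() is a hypothetical helper (name chosen for this example).
# It returns the n-th most common value, or NA when x has fewer than
# n distinct values. sort() drops NA by default, so NAs are never counted.
nth_mode <- function (x, n = 1L) f(x)[n]

x <- c(3, 1, 1, 2, 1, 2)
nth_mode(x, 1)  # 1 (appears three times)
nth_mode(x, 2)  # 2 (appears twice)
nth_mode(x, 5)  # NA (only three distinct values)
```

Note that this recomputes the full ranking on every call, so if you need several ranks from the same vector it is probably cheaper to call `f()` once and index the result.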

Another method is based on `table()`:

```
g <- function (x) as.numeric(names(sort(table(x), decreasing = TRUE)))
```

But this is not recommended, as the input vector `x` will be coerced to a factor first. If you have a large vector, this is very slow. Also, on exit we have to extract the character names of the table and coerce them to numeric.
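The final coercion to numeric also means `g()` only makes sense for numeric input. A small sketch of the difference, using an invented character vector `y`:

```r
f <- function (x) with(rle(sort(x)), values[order(lengths, decreasing = TRUE)])
g <- function (x) as.numeric(names(sort(table(x), decreasing = TRUE)))

# made-up character data: "b" appears 3 times, "a" twice, "c" once
y <- c("b", "a", "b", "c", "b", "a")

f(y)                    # "b" "a" "c": the input type is preserved
suppressWarnings(g(y))  # NA NA NA: the names cannot be parsed as numbers
```

Dropping the `as.numeric()` call from `g()` would avoid the `NA`s, but the result would then always be a character vector, whatever the type of `x`.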

**Example**

```
set.seed(0); x <- rpois(100, 10)
f(x)
# [1] 11 12 7 9 8 13 10 14 5 15 6 2 3 16
```

Let's compare with the contingency table from `table()`:

```
tab <- sort(table(x), decreasing = TRUE)
tab
# 11 12 7 9 8 13 10 14 5 15 6 2 3 16
# 14 14 11 11 10 10 9 7 5 4 2 1 1 1
as.numeric(names(tab))
# [1] 11 12 7 9 8 13 10 14 5 15 6 2 3 16
```

So the results are the same.
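For the grouped use case mentioned in the question, something along these lines might work. This is only a sketch: it assumes the dplyr package is installed, `nth_mode` is a hypothetical helper name, and the data frame `df` with its columns `grp` and `val` is invented for illustration.

```r
library(dplyr)

f <- function (x) with(rle(sort(x)), values[order(lengths, decreasing = TRUE)])

# hypothetical helper: n-th most common value, NA if there is none
nth_mode <- function (x, n = 1L) f(x)[n]

# made-up example data
df <- data.frame(
  grp = c("a", "a", "a", "a", "a", "a", "b", "b", "b"),
  val = c(1, 1, 1, 2, 2, 3, 5, 5, 4)
)

# one row per group, with the most and second most common value
res <- df %>%
  group_by(grp) %>%
  summarise(first = nth_mode(val, 1), second = nth_mode(val, 2))

res$first   # 1 5
res$second  # 2 4
```

Because `nth_mode()` returns `NA` for a missing rank, groups with too few distinct values still produce exactly one row in `summarise()`.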