jrubins - 1 month ago
R Question

Majority Voting in R

I have a data frame for which I want to calculate the majority vote by a factor, e.g.

item category
1 2
1 3
1 2
1 2
2 2
2 3
2 1
2 1


The output should be

item majority_vote
1 2
2 NA


You may recognize the example data from here, but I don't want the mode; I want the actual majority vote (meaning more than half of the voters selected that option). Hence item 2 should have no majority.

table() doesn't seem to help me because which.max() will only give me the modal value. I need to know three things: the number of votes I have, the name of the top option, and the number of times someone voted for that option. I can get the first two with

tapply(all_results_filtered$q1, all_results_filtered$X_row_id, function(x) length(x))

and

tapply(all_results_filtered$q1, all_results_filtered$X_row_id, function(x) as.numeric(names(which.max(table(x)))))

but how can I get the number of votes for which.max(table(x))?


Or... is there some simpler way that I'm missing?
Thanks!

Answer

Here is a dplyr option:

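library(dplyr)

# Example data from the question
df <- data.frame(item = rep(1:2, each = 4),
                 category = c(2L, 3L, 2L, 2L, 2L, 3L, 1L, 1L))

df %>% 
      group_by(item, category) %>% 
      mutate(votes = n()) %>%        # votes per item/category pair
      group_by(item) %>% 
      # keep a category only if it has more than half of the item's votes
      summarise(majority_vote = category[votes > n()/2][1])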

# A tibble: 2 x 2
#   item majority_vote
#  <int>         <int>
#1     1             2
#2     2            NA
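
If you'd rather stay with tapply() and table() as in the question, here is a minimal base R sketch of the same idea. The helper name majority_vote and the data frame df are my own, made up for illustration; df just rebuilds the example data. The count the question asks about is simply the largest entry of table(x), which is then compared with half the number of votes.

```r
# Example data from the question (rebuilt here so the snippet is self-contained)
df <- data.frame(
  item     = rep(1:2, each = 4),
  category = c(2L, 3L, 2L, 2L, 2L, 3L, 1L, 1L)
)

# Hypothetical helper: returns the option chosen by more than half
# of the voters, or NA when no option reaches a strict majority.
majority_vote <- function(x) {
  counts <- table(x)          # votes per option
  top    <- which.max(counts) # position of the modal option
  if (counts[[top]] > length(x) / 2) {
    as.integer(names(counts)[top])
  } else {
    NA_integer_
  }
}

# One result per item
result <- tapply(df$category, df$item, majority_vote)
result
```

Note that length(x) counts missing values while table() drops them by default, so with NAs in the votes you would have to decide whether they count toward the total. The example data has none, so the sketch ignores that case.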