jrubins - 8 months ago
R Question

Majority Voting in R

I have a data frame and want to calculate the majority vote for each level of a factor, e.g.

item category
1 2
1 3
1 2
1 2
2 2
2 3
2 1
2 1

The output should be

item majority_vote
1 2
2 NA

You may recognize the example data from here, but I don't want the mode; I want the actual majority vote (meaning more than half of the voters selected that option). Hence 'item 2' should have no majority.

doesn't seem to help me because
will only give me the modal value. I need to know three things: the total number of votes, the name of the most-voted option, and the number of votes that option received. I can get the first two with
tapply(all_results_filtered$q1, all_results_filtered$X_row_id, function(x) length(x))
tapply(all_results_filtered$q1, all_results_filtered$X_row_id, function(x) as.numeric(names(which.max(table(x)))))
, but how can I get the number of the votes for

Or... is there some simpler way that I'm missing?


Here is a dplyr option:

library(dplyr)

df %>%
      group_by(item, category) %>%
      mutate(votes = n()) %>%   # votes each category received within its item
      group_by(item) %>%
      summarise(majority_vote = category[votes > n()/2][1])   # NA if no category exceeds half

# A tibble: 2 x 2
#   item majority_vote
#  <int>         <int>
#1     1             2
#2     2            NA
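
For comparison, here is a minimal base R sketch along the lines of the `tapply` attempt in the question. It assumes the example data sits in a data frame called `df` with columns `item` and `category`. The helper name `majority_vote` is made up for illustration. The piece the question was missing, the number of votes for the top option, is the largest entry of `table(x)`.

```r
# Example data from the question
df <- data.frame(
  item     = c(1, 1, 1, 1, 2, 2, 2, 2),
  category = c(2, 3, 2, 2, 2, 3, 1, 1)
)

# Hypothetical helper: return the option chosen by more than half of the
# voters, or NA when no option reaches a strict majority.
majority_vote <- function(x) {
  counts <- table(x)          # number of votes per option
  top <- which.max(counts)    # position of the first option with the highest count
  if (counts[[top]] > length(x) / 2) {
    as.numeric(names(counts)[top])
  } else {
    NA_real_
  }
}

result <- tapply(df$category, df$item, majority_vote)
```

Note that `table()` drops `NA` values by default while `length(x)` counts them, so missing votes would count toward the total here. If that is not what you want, filter them out first.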