Yang Mei Lian Yang Mei Lian - 9 days ago 5
R Question

Testing mean of multiple series with ANOVA and Tukey HSD test in R

In R, I have four numerical vectors. I want to test if mean of these four vectors are the same or not, and if not, which mean is greater and which mean is smaller (basically I want to rank four vector by mean of them).

This post is quite suitable for me:

http://brownmath.com/stat/anova1

I am following this guide:

https://www.r-bloggers.com/analysis-of-variance-anova-for-multiple-comparisons/

However, this post stops at testing if four means are equal or not, but if the answer is not, how can I know which mean is greater and which mean is smaller in R?

Thank you very much,

The code from the post is:

a = c(65, 48, 66, 75, 70, 55)
b = c(64, 44, 70, 70, 68, 59)
c = c(60, 50, 65, 69, 69, 57)
d = c(62, 46, 68, 72, 67, 56)

dati = c(a, b, c, d)
groups = factor(rep(letters[1:4], each = 6))

fit = lm(formula = dati ~ groups)
anova (fit)

Answer

I'll assume that assumptions are fulfilled, in particular, that residuals are independent and that the distribution of residuals is homoskedastic and sufficiently similar to a normal distribution.

First note that your ANOVA returns a p-value of 0.99, which means there is no significant difference between group means. If the p-value was below 0.05, you could do a post-hoc test for pairwise comparisons.

TukeyHSD(aov(fit))
#  Tukey multiple comparisons of means
#    95% family-wise confidence level
#
#Fit: aov(formula = fit)
#
#$groups
#          diff       lwr      upr     p adj
#b-a -0.6666667 -15.68081 14.34748 0.9992898
#c-a -1.5000000 -16.51415 13.51415 0.9921250
#d-a -1.3333333 -16.34748 13.68081 0.9944302
#c-b -0.8333333 -15.84748 14.18081 0.9986181
#d-b -0.6666667 -15.68081 14.34748 0.9992898
#d-c  0.1666667 -14.84748 15.18081 0.9999888

Of course, there is nothing significant here.

A bit more powerful is the multcomp package. It let's you define different pairwise comparisons. It also can give you the group letters that you often see in tables or figures.

library(multcomp)
pairwise <- glht(fit, linfct = mcp(groups = "Tukey"))
summary(pairwise)
cld(pairwise)
#  a   b   c   d 
#"a" "a" "a" "a"

Again, since there is nothing significant in your example, all factor levels get the same letter.

Comments