Rory Shaw - 2 months ago
R Question

Extract values from nested list of summary(aov()) into a dataframe

I am running a simple one-way ANOVA across multiple groups within a single data frame.

Dataframe available here: https://www.dropbox.com/s/6nsjk4l1pgiwal3/cut1.csv?dl=0

> download.file('https://www.dropbox.com/s/6nsjk4l1pgiwal3/cut1.csv?raw=1', destfile = "cut1.csv", method = "auto")

> library(dplyr)
> data <- read.csv("cut1.csv")
> cut1 <- data %>% mutate(Plot = as.factor(Plot), Block = as.factor(Block), Cut = as.factor(Cut))

> str(cut1)
'data.frame': 160 obs. of 6 variables:
$ Plot : Factor w/ 16 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
$ Block : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 2 2 2 2 3 3 ...
$ Treatment : Factor w/ 4 levels "AN","C","IU",..: 4 2 3 1 1 3 4 2 3 1 ...
$ Cut : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
$ Measurement: Factor w/ 10 levels "ADF","Ash","Crude_Protein",..: 5 5 5 5 5 5 5 5 5 5 ...
$ Value : num 956 965 961 963 955 ...


I used some code from this SO question to apply the aov function to every level of the Measurement factor:

anova_1<- sapply(unique(as.character(cut1$Measurement)),
function(meas)aov(Value~Treatment+Block,cut1,subset=(Measurement==meas)),
simplify=FALSE,USE.NAMES=TRUE)
summary_1 <- lapply(anova_1, summary)


I can look through summary_1 manually, but ideally I would like to extract the p values for each level of the Measurement factor into a dataframe, which I could then filter so that I only see the ones that are < 0.05. I would then run TukeyHSD on these.

summary_1 looks like this (only the first 2 lists shown):

> str(summary_1)
List of 10
$ Dry_matter :List of 1
..$ :Classes ‘anova’ and 'data.frame': 3 obs. of 5 variables:
.. ..$ Df : num [1:3] 3 3 9
.. ..$ Sum Sq : num [1:3] 359 167 612
.. ..$ Mean Sq: num [1:3] 119.8 55.5 68
.. ..$ F value: num [1:3] 1.761 0.816 NA
.. ..$ Pr(>F) : num [1:3] 0.224 0.517 NA
..- attr(*, "class")= chr [1:2] "summary.aov" "listof"
$ Crude_Protein:List of 1
..$ :Classes ‘anova’ and 'data.frame': 3 obs. of 5 variables:
.. ..$ Df : num [1:3] 3 3 9
.. ..$ Sum Sq : num [1:3] 306 721 1606
.. ..$ Mean Sq: num [1:3] 102 240 178
.. ..$ F value: num [1:3] 0.572 1.347 NA
.. ..$ Pr(>F) : num [1:3] 0.647 0.319 NA
..- attr(*, "class")= chr [1:2] "summary.aov" "listof"


I can extract the p value from one of the lists in summary_1 like this:

> summary_1$OAH[[1]][,5][1]
[1] 0.4734992


However, I don't know how to extract the p values from all the nested lists and place them in a dataframe.

Much obliged for any help.

Answer

You can use the broom package together with dplyr to fit the ANOVA separately for each level of Measurement and collect the output in a tidy data.frame, with one row per model term.

library(broom)
library(dplyr)

summaries <- cut1 %>% group_by(Measurement) %>% 
        do(tidy(aov(Value ~ Treatment + Block, data = .)))

head(summaries)
#  Measurement      term    df      sumsq    meansq statistic    p.value
#       (fctr)     (chr) (dbl)      (dbl)     (dbl)     (dbl)      (dbl)
#1         ADF Treatment     3  41.416875 13.805625  3.097871 0.07138437
#2         ADF     Block     1   8.001125  8.001125  1.795388 0.20729351
#3         ADF Residuals    11  49.021375  4.456489        NA         NA
#4         Ash Treatment     3  38.511875 12.837292  1.051787 0.40840601
#5         Ash     Block     1  34.980125 34.980125  2.865998 0.11856463
#6         Ash Residuals    11 134.257375 12.205216        NA         NA
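
If you would rather stay in base R and work from the list of summaries built in the question, a minimal sketch might look like the following. The demo data frame is simulated purely for illustration (it is not cut1, and the treatment levels A to D are hypothetical). The sketch assumes Treatment is the first term in the formula, so its p value sits in row 1 of each ANOVA table, and it assumes a 0.05 cut-off.

```r
# Simulated stand-in for cut1 (NOT the real data):
# 2 measurements x 4 blocks x 4 hypothetical treatments = 32 rows
set.seed(1)
demo <- expand.grid(Treatment = factor(c("A", "B", "C", "D")),
                    Block = factor(1:4),
                    Measurement = c("ADF", "Ash"),
                    stringsAsFactors = FALSE)
# Give "ADF" a strong treatment effect so that something passes the filter
effect <- ifelse(demo$Measurement == "ADF", 10 * as.integer(demo$Treatment), 0)
demo$Value <- 50 + effect + rnorm(nrow(demo))

# One aov fit per measurement, kept in a named list
fits <- sapply(unique(demo$Measurement),
               function(meas) aov(Value ~ Treatment + Block,
                                  data = demo[demo$Measurement == meas, ]),
               simplify = FALSE, USE.NAMES = TRUE)
summaries_list <- lapply(fits, summary)

# summary() of an aov fit is a list whose first element is the ANOVA table;
# row 1 is Treatment because it is the first term in the formula
p_treatment <- sapply(summaries_list, function(s) s[[1]][["Pr(>F)"]][1])

# Collect the p values in a data frame and keep the significant ones
p_values <- data.frame(Measurement = names(p_treatment),
                       p.value = unname(p_treatment),
                       row.names = NULL)
significant <- p_values[p_values$p.value < 0.05, ]

# Post-hoc comparisons only for the measurements that passed the filter
tukey <- lapply(fits[significant$Measurement], TukeyHSD, which = "Treatment")
```

With the tidy output above, the equivalent filter would be filter(summaries, term == "Treatment", p.value < 0.05). Also note that the printed output shows Block with 1 degree of freedom, which suggests Block was still numeric when that output was produced; with Block converted to a factor, as in the question, it should have 3.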