drmariod drmariod - 29 days ago 9
R Question

Calculate means of variables in melted dataframe, grouped by name1

I have a problem using melt and mean on a data frame. Maybe it is just a beginner problem.

I have a data frame with two conditions and 3 replicates

set.seed(1)
testdf <- data.frame(name1=letters[1:5],con1_1=floor(runif(5,20,35)),con1_2=floor(runif(5,20,35)),con1_3=floor(runif(5,20,35)),con2_1=floor(runif(5,20,35)),con2_2=floor(runif(5,20,35)),con2_3=floor(runif(5,20,35)))

print(testdf)
name1 con1_1 con1_2 con1_3 con2_1 con2_2 con2_3
1 a 23 33 23 27 34 25
2 b 25 34 22 30 23 20
3 c 28 29 30 34 29 25
4 d 33 29 25 25 21 33
5 e 23 20 31 31 24 25


Now I melt the data frame to get something like this and I wand to calculate the mean per condition...

melt(testdf[1:7],id.vars='name1')
name1 variable value mean_con1 mean_con2
1 a con1_1 23 26.33333 28.66667
2 b con1_1 25 27.00000 24.33333
3 c con1_1 28 29.00000 29.33333
4 d con1_1 33 29.00000 26.33333
5 e con1_1 23 24.66667 26.66667
6 a con1_2 33 26.33333 28.66667
7 b con1_2 34 27.00000 24.33333
8 c con1_2 29 29.00000 29.33333
9 d con1_2 29 29.00000 26.33333
10 e con1_2 20 24.66667 26.66667
11 a con1_3 23 26.33333 28.66667
...


Is there a way to calculate the mean with the melted data frame, or do I have to recreate the original data frame and calculate the mean like below?

testdf$mean_con1 <- apply(testdf[2:4],1,mean)
testdf$mean_con2 <- apply(testdf[5:7],1,mean)


EDIT:

I just realised, I did a mistake in my question. Your answers are right, but what I am actually looking for would be the means in just one column. Alternating depending on the variable...
Does this make sense?

melt(testdf[1:7],id.vars='name1')
name1 variable value mean
1 a con1_1 23 26.33333
2 b con1_1 25 27.00000
3 c con1_1 28 29.00000
4 d con1_1 33 29.00000
5 e con1_1 23 24.66667
6 a con1_2 33 26.33333
7 b con1_2 34 27.00000
...
16 a con2_1 27 28.66667
17 b con2_1 30 24.33333
18 c con2_1 34 29.33333
19 d con2_1 25 26.33333
20 e con2_1 31 26.66667
21 a con2_2 34 28.66667
22 b con2_2 23 24.33333
...


Ok, maybe I explain what I am trying to do. I want to plot the data with ggplot, I want to make bars representing the mean and the values as geom_points as an overlay. I want to color the dots by replicates and group everything by the condition... Hope someone has an idea...

Answer

using base functions

 df1 <- reshape(testdf, idvar="name1", varying=2:7, direction="long",sep="_")
 aggregate(df1[,3:4], list(name1=df1[,1]), FUN=mean)
 #  name1     con1     con2
 #1     a 26.33333 28.66667
 #2     b 27.00000 24.33333
 #3     c 29.00000 29.33333
 #4     d 29.00000 26.33333
 #5     e 24.66667 26.66667

Or you can use data.table to calculate mean after it was reshaped

 library(data.table)
 setDT(df1)[, lapply(.SD, mean),.SDcols=c("con1", "con2"), by=name1]
 #   name1     con1     con2
 #1:     a 26.33333 28.66667
 #2:     b 27.00000 24.33333
 #3:     c 29.00000 29.33333
 #4:     d 29.00000 26.33333
 #5:     e 24.66667 26.66667

Update

If you need it in the long format

 library(reshape2)
 res <-  within( melt(testdf, id="name1"), {
         variable<- gsub("\\_.*","",variable)
          Mean<- ave(value, name1,variable)})

head(res)
#  name1 variable value     Mean
#1     a     con1    23 26.33333
#2     b     con1    25 27.00000
#3     c     con1    28 29.00000
#4     d     con1    33 29.00000
#5     e     con1    23 24.66667
#6     a     con1    33 26.33333

library(ggplot2)
res$repl <- rep(rep(1:3, each=5),2)
p <- ggplot(res, aes(name1, Mean))
p +
 geom_bar(position="dodge", stat="identity", fill="LightBlue") +
 geom_point(data=res, aes(name1, value, colour=repl))+
 facet_wrap(~variable) +
 theme_bw()

enter image description here

Comments