Sana Ali - 5 months ago
R Question

Creating a function to subset a data frame and then take the mean of a particular column in R

I'm hoping to get some help with this.
I have a data frame:

df <- data.frame(gem = c("Ruby", "Opal", "Topaz", "Ruby", "Ruby", "Opal"),
                 cut = c(2, 3, 4, 5, 6, 2))


The function I am aiming to write should first take a subset, i.e. the rows where gem is Ruby, and then take the mean of cut over that subset.

I have tried the following:

abc <- function(x,column1,val,coulmn2){
x%>%
subset(column1 %in% val)%>%
mean(na.omit(column2))}
abc(df,gem,"Ruby",cut)


This does not work. In the example above, the answer should be about 4.33 (the mean of 2, 5 and 6).

Answer

You don't need to write a function for this; base R has several ways to compute a mean per group, for example:

> aggregate(cut ~ gem, data = df, mean, na.rm = TRUE)
    gem      cut
1  Opal 2.500000
2  Ruby 4.333333
3 Topaz 4.000000

Or

> tapply(df$cut, df$gem, mean, na.rm = TRUE)
    Opal     Ruby    Topaz 
2.500000 4.333333 4.000000 
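
If only one group's mean is needed, it can be picked out of the tapply() result by name, or computed directly with a logical index. The following is a minimal sketch of both; the variable names group_means, ruby_mean and ruby_mean_direct are made up for illustration, and the data frame is rebuilt with quoted strings so the snippet runs on its own.

```r
# Example data from the question, with the gem names quoted
df <- data.frame(gem = c("Ruby", "Opal", "Topaz", "Ruby", "Ruby", "Opal"),
                 cut = c(2, 3, 4, 5, 6, 2))

# tapply() returns a named result, so one group can be selected by name
group_means <- tapply(df$cut, df$gem, mean, na.rm = TRUE)
ruby_mean <- unname(group_means["Ruby"])

# Alternatively, filter the column with a logical index and average it
ruby_mean_direct <- mean(df$cut[df$gem == "Ruby"], na.rm = TRUE)
```

Both variables hold the mean of the three Ruby rows (2, 5 and 6).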

If you really want a function that returns a single value, a base R version is:

> abc <- function(df, column1, val, column2) {
+   mean(df[which(df[, column1] == val), column2], na.rm = TRUE)
+ }
> abc(df, "gem", "Ruby", "cut")
[1] 4.333333
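
The function above takes the column names as quoted strings. If the unquoted call style from the question, abc(df, gem, "Ruby", cut), is preferred, one possible approach is to capture the argument names with substitute(). This is only a sketch under that assumption: abc_nse is a hypothetical name, and it uses %in% as in the question's attempt, so val may also be a vector of several values.

```r
# Example data from the question, with the gem names quoted
df <- data.frame(gem = c("Ruby", "Opal", "Topaz", "Ruby", "Ruby", "Opal"),
                 cut = c(2, 3, 4, 5, 6, 2))

# Hypothetical helper that accepts unquoted column names
abc_nse <- function(x, column1, val, column2) {
  # Turn the unevaluated arguments into column-name strings
  col1 <- deparse(substitute(column1))
  col2 <- deparse(substitute(column2))
  # Rows whose column1 value is one of val; %in% never yields NA
  keep <- x[[col1]] %in% val
  # Mean of column2 over the kept rows, ignoring missing values
  mean(x[[col2]][keep], na.rm = TRUE)
}

result <- abc_nse(df, gem, "Ruby", cut)
```

Here result is the mean of the three Ruby rows. Capturing names this way is convenient interactively, but the string-based version above is easier to call from other functions.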