I would like to estimate the values of a numeric variable in a data frame from the median of that same variable within the groups defined by other factors. I would then like to replace the NAs in the numeric variable with these estimates.
I have a data frame like this:
Fac1 Fac2 Var1
A a 20
A b 30
B a 5
B b 10
A a = 22
A b = 28
B a = 12
B b = 8
You haven't provided sample data, but based on your question, I think this should work.
As @Roland mentioned no need to calculate
Assuming your data frame is named df: for every group (here, each combination of Fac1 and Fac2) we calculate the median with the NA values removed. We then select only the indices that have NA values and replace each one with its group's median.
df$Var1[is.na(df$Var1)] <- ave(df$Var1, df$Fac1, df$Fac2, FUN = function(x) median(x, na.rm = TRUE))[is.na(df$Var1)]
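Since the question does not include data with missing values, here is a minimal sketch of the same approach on a made-up data frame. The Var1 values and the positions of the NAs are assumptions chosen for illustration only, and the names group_median and missing are hypothetical helpers, not part of the original answer.

```r
# Hypothetical sample data: values and NA positions are invented for illustration
df <- data.frame(
  Fac1 = c("A", "A", "B", "B", "A", "A", "B", "B"),
  Fac2 = c("a", "b", "a", "b", "a", "b", "a", "b"),
  Var1 = c(20, 30, 5, 10, NA, 28, 12, NA)
)

# ave() returns a vector as long as Var1, holding for each row the median
# of its Fac1/Fac2 group, computed with the NA values removed
group_median <- ave(df$Var1, df$Fac1, df$Fac2,
                    FUN = function(x) median(x, na.rm = TRUE))

# Overwrite only the rows where Var1 is missing
missing <- is.na(df$Var1)
df$Var1[missing] <- group_median[missing]

df
```

Computing the group medians once and indexing them with the same logical vector keeps the two sides of the assignment aligned. If a group contains only NA values, its median is also NA, so those rows would stay missing.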