Sly Grogger 25 - 1 year ago 72
R Question

# Replace multiple columns with column mean of non-zero values

I have data like so:

``````
aye <- c(0,0,3,4,5,6)
bee <- c(3,4,0,0,7,8)
see <- c(9,8,3,5,0,0)
df <- data.frame(aye, bee, see)
``````

I am looking for a concise way to replace the values in every column of the data frame with that column's mean of non-zero values, while keeping zeros at zero.

To obtain the mean excluding zero:

``````
df2 <- as.data.frame(t(apply(df, 2, function(x) mean(x[x>0]))))
``````

I can't figure out how to simply replace the values in the column with the mean excluding zero. My approach so far is:

``````
df$aye <- ifelse(df$aye == 0, 0, df2$aye)
df$bee <- ifelse(df$bee == 0, 0, df2$bee)
df$see <- ifelse(df$see == 0, 0, df2$see)
``````

But this gets messy with many variables - would be nice to wrap it up in one function.

Why not just use `ave()` on each column?

``````
data.frame(lapply(dat, function (u) ave(u, u > 0, FUN = mean)))

#  aye bee  see
#1 0.0 5.5 6.25
#2 0.0 5.5 6.25
#3 4.5 0.0 6.25
#4 4.5 0.0 6.25
#5 4.5 5.5 0.00
#6 4.5 5.5 0.00
``````
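
To see why this works, here is a minimal sketch of what `ave()` does on a single column (the vector `u` below is just the `aye` column from the question). It splits the values into groups by `u > 0`, takes the mean within each group, and writes each group mean back into the original positions. The zeros form their own group, whose mean is 0, so they stay at zero.

```r
# One column from the question's data
u <- c(0, 0, 3, 4, 5, 6)

# Group by "is the value positive?", then replace each value by its group mean
out <- ave(u, u > 0, FUN = mean)
out
# [1] 0.0 0.0 4.5 4.5 4.5 4.5
```

This assumes the data are non-negative, as in the question. With negative values, `u > 0` would put them in the same group as the zeros and they would be averaged together.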

Note that I used `dat` rather than `df` as the name of your data frame. `df` is already a function in R (the density of the F distribution), so it is best not to mask it.
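
Since the question asks for everything wrapped up in one function, a self-contained sketch might look like the following. The helper name `nonzero_mean_fill` is made up for illustration, and the code assumes every column is numeric and non-negative.

```r
# Hypothetical helper; the name is illustrative and not part of base R.
# Assumes all columns are numeric and non-negative.
nonzero_mean_fill <- function(data) {
  # For each column, group values by "positive or not" and replace each
  # value with its group mean; zeros form a group whose mean is 0.
  # Assigning into data[] keeps the result a data frame with the same names.
  data[] <- lapply(data, function(u) ave(u, u > 0, FUN = mean))
  data
}

# The question's data, stored under the name `dat`
dat <- data.frame(
  aye = c(0, 0, 3, 4, 5, 6),
  bee = c(3, 4, 0, 0, 7, 8),
  see = c(9, 8, 3, 5, 0, 0)
)

result <- nonzero_mean_fill(dat)
result
```

Calling `nonzero_mean_fill(dat)` should reproduce the table printed above, and the same call works unchanged no matter how many columns the data frame has.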
