stackoverflowuser2010 stackoverflowuser2010 - 3 months ago 19
R Question

Replace NA with mean based on row subset matching another column?

I have data where each row contains a person's gender and weight (in lbs):

genders <- c("FEMALE", "FEMALE", "FEMALE", "FEMALE", "FEMALE", "MALE", "MALE", "MALE", "MALE")
weights <- c(110.0, 120.0, 112.0, NA, NA, 190.0, 202.0, 195.0, NA)

df <- data.frame(gender=genders, weight=weights)
df
# gender weight
# 1 FEMALE 110
# 2 FEMALE 120
# 3 FEMALE 112
# 4 FEMALE NA
# 5 FEMALE NA
# 6 MALE 190
# 7 MALE 202
# 8 MALE 195
# 9 MALE NA


For every row that an NA in the
weight
column, I would like to replace / impute the NA with the
weight
mean, but the mean should be calculated using only the rows that match the same
gender
value as the row with the NA.

Specifically, rows 4 and 5 have a
gender
of FEMALE and
weight
of NA. I want to replace the NA with the mean
weight
computed over the subset of rows that match the
gender
of FEMALE. In this case, the mean would be (110+120+112)/3=114.0 from the other rows 1, 2, and 3.

Likewise, I want to replace the NA in row 9 with the mean of the weights for MALE
gender
.

I tried the following command, but it replaced the NA with the mean weight over all users across both genders, which is not what I want.

df$weight[is.na(df$weight)] <- mean(subset(df, gender=df$gender)$weight, na.rm=T)
df
# gender weight
# 1 FEMALE 110.0000
# 2 FEMALE 120.0000
# 3 FEMALE 112.0000
# 4 FEMALE 154.8333
# 5 FEMALE 154.8333
# 6 MALE 190.0000
# 7 MALE 202.0000
# 8 MALE 195.0000
# 9 MALE 154.8333


I searched other questions, but they are not quite the same problem as mine:

"Replace NA with mean matching the same ID"

"How to replace NA with mean by subset in R (impute with plyr?)"

"How to replace NA values in a table for selected columns? data.frame, data.table"

Answer

You could use ave() with replace() (or standard manual replacement).

df$weight <- with(df, ave(weight, gender,
    FUN = function(x) replace(x, is.na(x), mean(x, na.rm = TRUE))))

which gives

  gender   weight
1 FEMALE 110.0000
2 FEMALE 120.0000
3 FEMALE 112.0000
4 FEMALE 114.0000
5 FEMALE 114.0000
6   MALE 190.0000
7   MALE 202.0000
8   MALE 195.0000
9   MALE 195.6667
Comments