Frank_Zafka - 21 days ago
R Question

Plot Multiple Imputation Results

I have successfully completed a multiple imputation on the missing data of my questionnaire research using the mice package in R and performed a linear regression on the imputed data, pooling the results. I can't seem to work out how to extract single pooled variables and plot them in a graph. Any ideas?

e.g.

> imp <- mice(questionnaire)
> fit <- with(imp, lm(APE ~ TMAS + APB + APA + FOAP))
> summary(pool(fit))


I want to plot pooled APE by TMAS.

Reproducible Example using nhanes:

> library(mice)
> nhanes
> imp <- mice(nhanes)
> fit <- with(imp, lm(bmi ~ chl + hyp))
> fit
> summary(pool(fit))


I would like to plot pooled chl against pooled bmi (for example).

Best I have been able to achieve is

> mat <- complete(imp, "long")
> plot(mat$chl ~ mat$bmi)


Which I believe gives the combined plot of all 5 imputations and is not quite what I am looking for (I think).

Answer

The underlying with.mids() function runs the regression separately on each imputed data frame, so there is not one regression but five. pool() averages the estimated coefficients and combines the within-imputation and between-imputation variances (Rubin's rules), so that the standard errors used for inference reflect the extra uncertainty caused by the missing data.
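
To see that pooling is an average over separate fits, the pooled numbers can be rebuilt by hand. This is only a sketch: the `seed` and `printFlag` arguments are my additions to keep it reproducible and quiet, and the `pool(fit)$pooled` data frame with its `estimate` and `t` columns assumes mice 3.0 or later.

```r
library(mice)

# Assumed setup mirroring the question: 5 imputations of nhanes.
# seed and printFlag are only there for reproducibility and quiet output.
imp <- mice(nhanes, m = 5, seed = 1, printFlag = FALSE)
fit <- with(imp, lm(bmi ~ chl + hyp))

# fit$analyses holds one lm object per imputed data set
per_imp <- sapply(fit$analyses, coef)  # rows = terms, columns = imputations
print(per_imp)

# Pooled point estimate: plain average of the coefficients
manual_est <- rowMeans(per_imp)

# Within-imputation variance: average squared standard error per term
ubar <- rowMeans(sapply(fit$analyses, function(m) diag(vcov(m))))

# Between-imputation variance: spread of the estimates across imputations
b <- apply(per_imp, 1, var)

# Total variance under Rubin's rules
manual_var <- ubar + (1 + 1 / ncol(per_imp)) * b

# Compare with what pool() reports (mice >= 3.0 layout assumed)
pooled <- pool(fit)$pooled
print(cbind(manual_est, pool_est = pooled$estimate))
print(cbind(manual_var, pool_var = pooled$t))
```

The two columns in each printed table should agree, which is the sense in which there is no single "pooled data set" behind the pooled coefficients.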

So there aren't single pooled variables to plot. What you could do is average the 5 imputed sets and recreate some kind of "regression line" based on the pooled coefficients, e.g.:

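# Averaged imputed data: mean over the 5 imputations for each original row
# (mat is the long-format data from the question: mat <- complete(imp, "long"))
combchl <- tapply(mat$chl, mat$.id, mean)
combbmi <- tapply(mat$bmi, mat$.id, mean)
combhyp <- tapply(mat$hyp, mat$.id, mean)

# Pooled coefficients (intercept, chl, hyp).
# mice >= 3.0 stores them in pool(fit)$pooled$estimate;
# older versions exposed them as pool(fit)$qbar.
coefs <- pool(fit)$pooled$estimate

# Design matrix for the fitted line; note that chl and hyp
# both run from their minimum to their maximum together
x <- data.frame(
        int = rep(1, 25),
        chl = seq(min(combchl), max(combchl), length.out = 25),
        hyp = seq(min(combhyp), max(combhyp), length.out = 25)
      )

y <- as.matrix(x) %*% coefs

# Plot the averaged data and overlay the line
plot(combbmi ~ combchl)
lines(x$chl, y, col = "red")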
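
A possible variation, offered as a rough sketch and not as the standard way to do this: hold hyp at its mean so that the red line shows the pooled chl slope on its own, and draw the five per-imputation fits in grey to show how much they differ. The `seed`, `printFlag`, axis labels, and grid size are my own choices, and `pool(fit)$pooled$estimate` assumes mice 3.0 or later.

```r
library(mice)

# Same assumed setup as in the question, with a fixed seed for reproducibility
imp <- mice(nhanes, m = 5, seed = 1, printFlag = FALSE)
fit <- with(imp, lm(bmi ~ chl + hyp))

# Average the imputed values per original row (.id identifies the row)
mat <- complete(imp, "long")
combchl <- tapply(mat$chl, mat$.id, mean)
combbmi <- tapply(mat$bmi, mat$.id, mean)
combhyp <- tapply(mat$hyp, mat$.id, mean)

# Prediction grid: chl varies, hyp is fixed at its mean
grid <- cbind(int = 1,
              chl = seq(min(combchl), max(combchl), length.out = 50),
              hyp = mean(combhyp))

plot(combbmi ~ combchl,
     xlab = "chl (mean over imputations)",
     ylab = "bmi (mean over imputations)")

# One grey line per imputed data set
for (a in fit$analyses) {
  lines(grid[, "chl"], drop(grid %*% coef(a)), col = "grey")
}

# Pooled line on top
pooled_est <- pool(fit)$pooled$estimate
y <- drop(grid %*% pooled_est)
lines(grid[, "chl"], y, col = "red", lwd = 2)
```

If the grey lines fan out widely, the imputations disagree about the slope, which is the uncertainty that pool() folds into the standard errors.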