Endre - 6 months ago 47

R Question

I would like to create an interaction marginal effects plot where the histogram of the predictor is in the background of the plot. The issue is somewhat complicated since the marginal effects plot is also faceted. I would like the end result to look like something the MARHIS package does in Stata. In case of continuous predictors, I just use

`geom_rug`

`geom_histogram`

`ggplot(newdat, aes(cenretrosoc, linetype = factor(wave))) +`

geom_line(aes(y = cengovrec), size=0.8) +

scale_linetype_manual(values = c("dotted", "twodash", "solid")) +

geom_line(aes(y = plo,

group=factor(wave)), linetype =3) +

geom_line(aes(y = phi,

group=factor(wave)), linetype =3) +

facet_grid(. ~ regioname) +

xlab("Economy") +

ylab("Disapproval of Record") +

labs(linetype='Wave') +

theme_minimal()

which works, and produces this graph: 1

However, when I add the histogram bit

`+ geom_histogram(aes(cenretrosoc), position="identity", linetype=1,`

fill="gray60", data = data, alpha=0.5)

this is what happens:2

I think it's because of the different scales the predicted probabilities and the histogram has on the Y axis. But I am not sure how to solve this. Any ideas?

Here's a reproducible example to illustrate the issue (it needs the WWGbook package for the data it uses)

`# install.packages("WWGbook")`

# install.packages("lme4")

# install.packages("ggplot2")

require("WWGbook")

require("lme4")

require("ggplot2")

# checking the dataset

head(classroom)

# specifying the model

model <- lmer(mathgain ~ yearstea*sex*minority

+ (1|schoolid/classid), data=classroom)

# dataset for prediction

newdat <- expand.grid(

mathgain = 0,

yearstea = seq(min(classroom$yearstea, rm=TRUE),

max(classroom$yearstea, rm=TRUE),

5),

minority = seq(0, 1, 1),

sex = seq(0,1,1))

mm <- model.matrix(terms(model), newdat)

## calculating the predictions

newdat$mathgain <- predict(model,

newdat, re.form = NA)

pvar1 <- diag(mm %*% tcrossprod(vcov(model), mm))

## Calculating lower and upper CI

cmult <- 1.96

newdat <- data.frame(

newdat, plo = newdat$mathgain - cmult*sqrt(pvar1),

phi = newdat$mathgain + cmult*sqrt(pvar1))

## this is the plot of fixed effects uncertainty

marginaleffect <- ggplot(newdat, aes(yearstea, linetype = factor(sex))) +

geom_line(aes(y = mathgain), size=0.8) +

scale_linetype_manual(values = c("dotted", "twodash")) +

geom_line(aes(y = plo,

group=factor(sex)), linetype =3) +

geom_line(aes(y = phi,

group=factor(sex)), linetype =3) +

facet_grid(. ~ minority) +

xlab("First grade teacher years of teaching experience") +

ylab("Predicted Probability of student gain in math") +

labs(linetype='Sex') +

theme_minimal()

As one can see

`marginaleffect`

`marginaleffect + geom_histogram(aes(yearstea), position="identity", linetype=1,`

fill="gray60", data = classroom, alpha=0.5)

It does add the histogram, but it overwrites the OY scale with the histogram values. In this example one can still see the effects as the original predicted probabilities scale was comparable to the frequencies. However, in my case, a data-set with many values, this is not the case.

Preferably I would not have any scale for the histogram shown. It should just have a maximum which is the predicted probabilities scale maximum, so it covers the same area, but it does not overwrite the pred prob values on the vertical axis.

Answer

Check this post out: https://rpubs.com/kohske/dual_axis_in_ggplot2

I followed the steps and was able to get the following plot:

Also, note that I've added `scale_y_continuous(expand = c(0,0))`

to make histograms start from the x-axis.

```
m1 <- marginaleffect + geom_line(aes(y = mathgain), size=0.8) +
scale_linetype_manual(values = c("dotted", "twodash")) +
geom_line(aes(y = plo,
group=factor(sex)), linetype =3) +
geom_line(aes(y = phi,
group=factor(sex)), linetype =3)
m2 <- marginaleffect + geom_histogram(aes(yearstea), position="identity", linetype=1,
fill="gray60", data = classroom, alpha=0.5, bins = 30) +
scale_y_continuous(expand = c(0,0))
g1 <- ggplot_gtable(ggplot_build(m1))
g2 <- ggplot_gtable(ggplot_build(m2))
library(gtable)
pp <- c(subset(g1$layout, name == "panel", se = t:r))
# The link uses g2$grobs[[...]] but it doesn't seem to work... single bracket works, on the other hand....
g <- gtable_add_grob(g1, g2$grobs[which(g2$layout$name == "panel")], pp$t, pp$l, pp$b, pp$l)
library(grid)
grid.draw(g)
```