James F - 3 months ago
R Question

Forest plot in R: Trying to plot more than three points

This is my first time posting a question here, so please forgive me if it is unclear or incomplete.

My scenario: I have a dataframe that has 21 meta-analytic distributions (Distribution1-Distribution21). For each distribution, I have 10 estimates of the respective meta-analytic mean effect size (ES1-ES10). Effectively, I have a meta-analytic mean effect size and nine other estimates of this mean from a variety of sensitivity analyses (i.e., outlier and publication bias analyses).

Using adapted code (I can provide a link if needed; as a new user I am not able to post multiple links), I am able to plot three estimates of each distribution's mean. To give you an idea of what I'm talking about, imagine a figure that displays the mean estimate and its confidence interval.

Here is the dataframe and adapted code:

x | ES1 | ES2 | ES3 | ES4 | ES5 | ES6 | ES7 | ES8 | ES9 | ES10
Distribution1 | -0.07 | -0.07 | -0.06 | -0.07 | -0.02 | -0.03 | -0.09 | -0.07 | 0.00 | 0.01
Distribution2 | -0.06 | -0.06 | -0.04 | -0.05 | -0.04 | -0.05 | -0.07 | -0.06 | -0.03 | 0.01
Distribution3 | -0.08 | -0.09 | -0.07 | -0.08 | -0.01 | -0.08 | -0.10 | -0.08 | -0.01 | 0.01
Distribution4 | -0.10 | -0.11 | -0.10 | -0.09 | -0.05 | -0.07 | -0.11 | -0.10 | -0.06 | 0.010
Distribution5 | -0.08 | -0.08 | -0.06 | -0.08 | -0.02 | -0.03 | -0.10 | -0.08 | 0.00 | 0.02
Distribution6 | -0.09 | -0.10 | -0.08 | -0.09 | -0.03 | -0.08 | -0.11 | -0.09 | -0.03 | 0.02
Distribution7 | -0.11 | -0.13 | -0.10 | -0.11 | -0.04 | -0.04 | -0.12 | -0.11 | -0.08 | 0.01
Distribution8 | -0.10 | -0.14 | -0.06 | -0.10 | -0.01 | -0.08 | -0.13 | -0.10 | -0.06 | 0.04
Distribution9 | -0.04 | -0.05 | -0.02 | -0.04 | 0.00 | -0.04 | -0.06 | -0.04 | -0.06 | 0.00
Distribution10 | -0.11 | -0.12 | -0.09 | -0.11 | -0.03 | -0.09 | -0.12 | -0.11 | -0.11 | 0.00
Distribution11 | -0.06 | -0.09 | -0.04 | -0.06 | -0.01 | -0.01 | -0.09 | -0.06 | -0.01 | 0.04
Distribution12 | -0.11 | -0.11 | -0.09 | -0.11 | -0.09 | -0.10 | -0.12 | -0.11 | -0.08 | -0.03
Distribution13 | -0.19 | -0.22 | -0.16 | -0.19 | -0.08 | -0.17 | -0.21 | -0.19 | -0.13 | -0.01
Distribution14 | -0.01 | -0.02 | 0.00 | -0.01 | 0.00 | 0.00 | -0.03 | -0.01 | -0.02 | -0.01
Distribution15 | -0.19 | -0.22 | -0.16 | -0.19 | -0.08 | -0.17 | -0.21 | -0.19 | -0.13 | -0.01
Distribution16 | -0.09 | -0.1 | -0.08 | -0.09 | -0.01 | -0.09 | -0.11 | -0.09 | -0.07 | 0.00
Distribution17 | -0.16 | -0.19 | -0.14 | -0.16 | -0.07 | -0.12 | -0.18 | -0.16 | -0.10 | 0.00
Distribution18 | -0.05 | -0.06 | -0.03 | -0.05 | -0.02 | -0.02 | -0.05 | -0.05 | -0.02 | 0.01
Distribution19 | -0.09 | -0.10 | -0.08 | -0.09 | -0.01 | -0.08 | -0.11 | -0.09 | -0.06 | 0.01
Distribution20 | -0.02 | -0.03 | -0.01 | -0.02 | 0.01 | 0.00 | -0.04 | -0.02 | 0.00 | 0.02
Distribution21 | -0.1 | -0.12 | -0.09 | -0.1 | -0.02 | -0.08 | -0.12 | -0.1 | -0.04 | 0.02

#My APA-format theme
#https://gist.github.com/akshaycuhk/01576c57149a9a3d14514c9a3c4b4b1d

install.packages("ggplot2")  # only needed once per R installation
library(ggplot2)

apatheme=theme_bw()+
theme(panel.grid.major=element_blank(),
panel.grid.minor=element_blank(),
panel.border=element_blank(),
axis.line=element_line(),
text=element_text(family='Times'),
legend.position='bottom', axis.text=element_text(size=14),
axis.title=element_text(size=14,face="bold"))

credplot.gg <- function(d){
# d is a data frame with (at least) 4 columns
# d$x gives the distribution names
# d$ES1 gives the center point
# d$ES2 gives the lower limit
# d$ES3 gives the upper limit
require(ggplot2)
p <- ggplot(d, aes(x=x, y=ES1, ymin=ES2, ymax=ES3))+
geom_pointrange()+
geom_hline(yintercept = 0, linetype=2)+
coord_flip()+
xlab('Distribution')+
ylab('Effect size')
return(p)
}

# load your data below
d <- read.table(file.choose(), sep=",", header=TRUE)
# Fix the axis order: Distribution1, Distribution2, ..., Distribution21
Fig1 <- credplot.gg(d) + xlim(paste0("Distribution", 1:21))
Fig1


I am not yet able to embed images so here is a link to the data file, script, and figure:
https://www.dropbox.com/sh/aczv1dw5mjmone8/AACqekiFVdJqeA1cRvIvs7NFa?dl=0

My question: Is there a way for me to increase the number of point estimates from three to ten? Specifically, can I plot all ten estimates (ES1 -> ES10) for all 21 distributions (Distribution1 -> Distribution21)? Ideally, each point estimate would have its own shape/marker on the line to distinguish it from the others and a legend would accompany the figure.

Thanks to anyone who is willing to help me :)

Answer

Is this what you are trying for? It involves reshaping your dataset into long format, adding points with a different shape for each estimate (ES1-ES10), and then drawing a line through the points of each distribution to emulate a forest plot.
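To make "long format" concrete, here is a minimal base-R sketch of the same reshaping on the first two rows and first three estimate columns of the data above. The names d_small, es_cols and long_small are made up for this illustration, and the manual stacking only approximates what melt() returns (the same three columns: x, variable, value).

```r
# First two rows and three estimate columns from the question's data
d_small <- data.frame(
  x   = c("Distribution1", "Distribution2"),
  ES1 = c(-0.07, -0.06),
  ES2 = c(-0.07, -0.06),
  ES3 = c(-0.06, -0.04),
  stringsAsFactors = FALSE
)

# Columns to stack (everything except the id column)
es_cols <- setdiff(names(d_small), "x")

# Wide -> long: one row per (distribution, estimate) pair
long_small <- data.frame(
  x        = rep(d_small$x, times = length(es_cols)),
  variable = factor(rep(es_cols, each = nrow(d_small)), levels = es_cols),
  value    = unlist(d_small[es_cols], use.names = FALSE)
)

# Keep distributions in the order they appear in the data
long_small$x <- factor(long_small$x, levels = unique(long_small$x))

print(long_small)
```

Each row now holds one estimate for one distribution, which is the layout ggplot2 needs in order to map the estimate name to a shape. The full version with reshape2 and ggplot2 follows.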

library(reshape2)
dat2 = melt(d, id.vars = "x")

# Set x factor order in order that appears in data
dat2$x = factor(dat2$x, levels = unique(dat2$x))

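ggplot(dat2, aes(x = x, y = value)) +
    geom_point(aes(shape = variable)) +   # one shape per estimate
    geom_line(aes(group = x)) +           # one line per distribution
    scale_shape_manual(values = 0:9) +    # ten distinct shapes for ES1-ES10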
    geom_hline(yintercept = 0, linetype=2) +
    coord_flip() +
    xlab('Distribution') +
    ylab('Effect size')

Note that things get ugly fast when using this many shapes. See here for some shape options.
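Since the shapes are picked by number, it can help to preview the ten codes before settling on them. This is a small sketch using base graphics, assuming (as the ggplot2 documentation describes) that numeric shape values follow the same codes as base R's pch; shape_codes is just an illustrative name.

```r
# Shape codes passed to scale_shape_manual(values = 0:9)
shape_codes <- 0:9

# Draw each symbol above its code, using base graphics
plot(shape_codes, rep(1, length(shape_codes)),
     pch = shape_codes, cex = 2,
     xlab = "shape code", ylab = "",
     xaxt = "n", yaxt = "n", ylim = c(0.5, 1.5))
axis(1, at = shape_codes)
```

If some of the symbols are hard to tell apart at the size used in the figure, swapping in other codes in scale_shape_manual(values = ...) is one option.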