The Bryer Likert package has many useful features for plotting diverging bar charts of Likert-type data. However, one basic feature is missing -- there does not appear to be any way to show the total number of sample points for each question/group when printing out a bar chart. If one wants to include the histogram chart, then these n-values will appear in the histogram. But often I find the histogram makes the entire plot too busy.
For example, using the pisa dataset, I can plot a diverging bar chart for results grouped by country below.
library(likert)
data(pisaitems)
items28 <- pisaitems[, substr(names(pisaitems), 1, 5) == "ST24Q"]
# Create the likert object using country as a grouping variable.
l28g <- likert(items28, grouping = pisaitems$CNT)
# Optional - print a summary.
summary(l28g)
# Plot the bar chart.
plot(l28g)
# Count responses to item ST24Q01 by country (table() drops NAs by default).
margin.table(table(pisaitems$CNT, items28$ST24Q01), 1)
In this case the counts don't vary by question, so you only need one table for the number of responses. Below are two ways to show them: next to each question (for cases where the number of responses varies by question) or as a single table.
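If you want to check whether the counts really are the same across questions, something like the following base R sketch might help. It uses a made-up data frame (toy and its values are hypothetical stand-ins for pisaitems) and counts non-missing responses per country for every item at once.

```r
# Hypothetical toy data standing in for pisaitems: two countries, two items,
# with a few missing responses.
toy <- data.frame(
  CNT     = c("A", "A", "A", "B", "B"),
  ST24Q01 = c("Agree", NA, "Disagree", "Agree", "Agree"),
  ST24Q02 = c("Agree", "Agree", NA, NA, "Disagree")
)

# For each item column, count the non-missing responses within each country.
# The result is a matrix with one row per country and one column per item.
resp_counts <- sapply(
  toy[, c("ST24Q01", "ST24Q02")],
  function(col) tapply(!is.na(col), toy$CNT, sum)
)

print(resp_counts)
```

If every row of the matrix holds a single repeated value, one table of counts per country is enough; otherwise per-question counts are probably worth showing.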
One way to do this would be to modify the underlying code for likert.bar.plot to include the ability to add counts of responses. Here I've just hacked the output of likert.bar.plot to add the response counts after the fact.
library(dplyr)
library(grid)       # for unit() and grid.draw()
library(gridExtra)
library(reshape2)
First, get response counts by Item for each country (CNT). The mutate(variable=NA) at the end is there because the original data frame that likert.bar.plot generates in creating the plot includes and uses a column called variable. Even though we don't use that column in our subsequent call to geom_text with the new data frame below, ggplot still expects that column to be present in the new data frame.
counts = pisaitems %>%
  select(CNT, matches("ST24Q")) %>%
  melt(id.vars="CNT", variable.name="Item") %>%
  count(CNT, Item) %>%
  mutate(variable=NA)
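Note that count() tallies rows, so missing responses are included in n. If you'd rather count only the questions that were actually answered, one possible variation is to drop the NA values before counting. Here's a minimal sketch on a made-up data frame (toy and counts_nonmissing are hypothetical names, not part of the likert package or the pisa data):

```r
library(dplyr)
library(reshape2)

# Hypothetical toy data standing in for pisaitems.
toy <- data.frame(
  CNT     = c("A", "A", "A", "B", "B"),
  ST24Q01 = c("Agree", NA, "Disagree", "Agree", "Agree"),
  ST24Q02 = c("Agree", "Agree", NA, NA, "Disagree")
)

counts_nonmissing <- toy %>%
  melt(id.vars = "CNT", variable.name = "Item") %>%  # long format: CNT, Item, value
  filter(!is.na(value)) %>%                          # keep answered questions only
  count(CNT, Item) %>%                               # one row per country/item pair
  mutate(variable = NA)                              # placeholder column, as above

print(counts_nonmissing)
```

The resulting data frame has the same columns as counts, so it should be usable in the same geom_text call.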
We use geom_text to add response counts by item, but we also need to make a few other changes to the output of plot(l28g), as follows:
Expand the y-axis limits using scale_y_continuous out to 150 so that the text values (which I've put at 145) will be visible. This overrides the y-scale in the original plot created by plot(l28g) (which calls likert.bar.plot to actually produce the plot).
Set the visible y-axis range to stop at 110. We do this inside coord_flip(), which overrides the original coord_flip() call in likert.bar.plot. We do this so that the text for the number of responses will be just to the right of the plot area, rather than inside it.
Increase the right plot margin, so that there will be some space to the right of the plot.
Turn off clipping, so that text printed outside the plot area will be visible.
Here's the plot code. It might take several seconds to render, so be patient.
p = plot(l28g) +
  geom_text(data=counts, aes(label=format(n, big.mark=","), x=CNT, y=145),
            size=2.5, colour="grey30", hjust=1) +
  scale_y_continuous(limits=c(-100,150)) +
  coord_flip(ylim=c(-110,110)) +
  theme(plot.margin=unit(c(0.2,2,0.2,0.2),"cm"))

# Turn off clipping
# http://stackoverflow.com/a/9691256/496488
p <- ggplot_gtable(ggplot_build(p))
p$layout$clip <- "off"
grid.draw(p)
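As an aside, newer ggplot2 releases (3.0.0 and later, as far as I know) let you switch off clipping with the clip argument of coord_flip(), which may make the gtable step unnecessary. Here's a minimal sketch on a made-up data frame (toy, pct and n are hypothetical, not from pisaitems). I haven't verified it against the likert plot itself.

```r
library(ggplot2)

# Hypothetical toy data: one bar per group plus a response count to print.
toy <- data.frame(
  group = c("A", "B", "C"),
  pct   = c(40, 65, 80),
  n     = c(1200, 950, 4321)
)

p <- ggplot(toy, aes(x = group, y = pct)) +
  geom_col() +
  # Put the counts at y = 145, beyond the visible range set in coord_flip().
  geom_text(aes(y = 145, label = format(n, big.mark = ",")),
            hjust = 1, size = 2.5, colour = "grey30") +
  # Scale limits must include 145, or the text is dropped as out of bounds.
  scale_y_continuous(limits = c(0, 150)) +
  # Show only 0 to 110, and let text be drawn outside the panel.
  coord_flip(ylim = c(0, 110), clip = "off") +
  # Extra right margin to make room for the counts.
  theme(plot.margin = unit(c(0.2, 2, 0.2, 0.2), "cm"))

print(p)
```

The design is the same as above: widen the scale so the labels survive, narrow the visible range so they land in the margin, and disable clipping so they are drawn.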
For a single table of response counts, one option is to create a table grob (grob = graphical object) and lay it out alongside or below the main plot. For example:
library(dplyr)
library(grid)       # for textGrob(), nullGrob() and gpar()
library(gridExtra)
library(reshape2)

# Table theme with a smaller font
tt <- ttheme_default(
  core=list(fg_params=list(fontsize=9)),
  colhead=list(fg_params=list(fontsize=9)),
  rowhead=list(fg_params=list(fontsize=9)))

# Main plot on the left; title and table of counts on the right
grid.arrange(plot(l28g),
             arrangeGrob(
               nullGrob(),
               textGrob("Number of Responses",
                        gp=gpar(fontsize=11, fontface="bold")),
               tableGrob(pisaitems %>%
                           rename(Country=CNT) %>%
                           count(Country) %>%
                           mutate(n=format(n, big.mark=",")),
                         theme=tt, rows=NULL),
               nullGrob(),
               heights=c(15,1,5,15)),
             widths=c(3,1))