user53558 user53558 -4 years ago 146
R Question

Color Scheme In ggplot2 facet_wrap

I am making a multiple scatterplot to show interaction. I used the melt function from the reshape2 package to make my data look like this:

head(wage)
money educ exper tenure nonwhite female married numdep smsa Region Industry
1 3.10 11 2 0 White Female Notmarried 2 1 west other
2 3.24 12 22 2 White Female Married 3 1 west services
3 3.00 11 2 0 White Male Notmarried 2 0 west trade
4 6.00 8 44 28 White Male Married 0 1 west clerocc
5 5.30 12 7 2 White Male Married 1 0 west other
6 8.75 16 9 8 White Male Married 0 1 west profserv


test1 = wage %>% select(money, educ, female, nonwhite, married, smsa, Region, Industry)
test1a = melt(test1, id.vars= c('money', 'educ'))

head(test1a)

money educ variable value
1 3.10 11 female Female
2 3.24 12 female Female
3 3.00 11 female Male
4 6.00 8 female Male
5 5.30 12 female Male
6 8.75 16 female Male

tail(test1a)
money educ variable value
3151 5.65 12 Industry construc
3152 15.00 16 Industry profserv
3153 2.27 10 Industry trade
3154 4.67 15 Industry construc
3155 11.56 16 Industry nondur
3156 3.50 14 Industry profserv


The ggplot function I am using is:

ggplot(test1a, aes(educ,money, col = value )) + geom_point()+
facet_wrap(~ variable) + geom_smooth(method = 'lm', se = FALSE) +
theme(legend.position="none")


Which is giving me the following plot:
Plot

This is exactly what I'm looking for, except that I want all 6 plots to use the same color scheme. In other words, every panel should use the same green/yellow colors as the top-left panel.

Any suggestions?

Answer Source

I generated some data to illustrate this answer:

test1a <- data.frame(money = rnorm(10), educ = rnorm(10), 
                     variable = c("female","female","female","female","female","Industry","Industry","Industry","Industry","Industry"),
                     value = c("Female", "Female", "Male", "Male", "Female", "construc", "construc", "trade", "trade", "trade"))

        money         educ variable    value
1   0.6509500  0.822198786   female   Female
2  -0.7038793  0.257554982   female   Female
3  -0.9110664 -1.048976078   female     Male
4   0.1313963 -1.398813412   female     Male
5  -0.6050824  0.818251963   female   Female
6   1.2937046 -0.289675281 Industry construc
7   1.1670726 -0.004767622 Industry construc
8   0.3489473 -0.633061650 Industry    trade
9  -0.1536924 -0.567433569 Industry    trade
10  1.3932668 -0.010446676 Industry    trade

Libraries used

library(ggplot2)
library(dplyr)

First, get a table of the variable-value pairs in use:

uniqueVarVal <- unique(test1a[, c("variable", "value")])

  variable    value
1   female   Female
3   female     Male
6 Industry construc
8 Industry    trade

The aim is to get a manual color scale for the female variable and use the same scheme for the Industry variable.

Next, choose the colors to be used. I've only specified 2; you will need more, because some of your variables have more than 2 values.

colors <- c("red", "green")
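If you would rather not type the colors by hand, one option is to generate as many as the largest variable needs. This is only a sketch: the table below is a hypothetical stand-in with an uneven number of values per variable, and hcl.colors() (base R, version 3.6 or later) with the "viridis" palette is my choice here, not something the question requires.

```r
# Hypothetical variable/value table; "Industry" has 3 values, "female" has 2
uniqueVarVal <- data.frame(
  variable = c("female", "female", "Industry", "Industry", "Industry"),
  value    = c("Female", "Male", "construc", "trade", "services")
)

# Largest number of distinct values found under any single variable
n_max <- max(table(uniqueVarVal$variable))

# One color per slot; variables with fewer values use only the first few
colors <- hcl.colors(n_max, palette = "viridis")
```

Because the colors are later assigned by position within each variable, a variable with only two values simply picks up the first two entries of this vector.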

Add the color to be used to our table of variable-values

colValues <- uniqueVarVal %>%
    group_by(variable) %>%
    mutate(color = colors[row_number()]) %>%
    ungroup()

# A tibble: 4 × 3
  variable    value color
    <fctr>   <fctr> <chr>
1   female   Female   red
2   female     Male green
3 Industry construc   red
4 Industry    trade green

Next, set the levels of the value column explicitly; otherwise they are ordered alphabetically and the colors no longer line up with the rows of colValues.

test1a$value <- factor(test1a$value, levels = colValues$value)

Finally specify a manual color scale using the repeated pattern, red-green.

ggplot(test1a, aes(educ,money, col = value )) +
    geom_point(alpha = 0.3) +
    geom_smooth(method = 'lm', se = FALSE)  +
    scale_color_manual(values = colValues$color) +
    facet_wrap(~ variable) 

I have left the legend showing, so you can see what is happening.
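A slightly more defensive variant is to pass scale_color_manual() a named vector, so each color is matched to its value by name rather than by the order of the factor levels. The sketch below rebuilds a small colValues table by hand so that it runs on its own; pal and named_scale are names I made up for illustration.

```r
library(ggplot2)

# Same variable/value/color table as above, rebuilt by hand for this sketch
colValues <- data.frame(
  variable = c("female", "female", "Industry", "Industry"),
  value    = c("Female", "Male", "construc", "trade"),
  color    = c("red", "green", "red", "green")
)

# Named vector: names are the data values, elements are the colors
pal <- setNames(colValues$color, colValues$value)

# Can be added to the plot in place of the unnamed version
named_scale <- scale_color_manual(values = pal)
```

With a named vector, the mapping stays correct even if the factor levels end up in a different order.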

Given the density of your points, I'd recommend using alpha to set transparency.
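Since the legend is hidden in your plot anyway, another way to get the same effect is to color by each value's position within its own facet instead of by the value itself. This is an alternative sketch, not the method above; the column name slot and the object names test1b and p are hypothetical.

```r
library(ggplot2)
library(dplyr)

set.seed(1)  # only so the simulated data is reproducible
test1a <- data.frame(
  money    = rnorm(10),
  educ     = rnorm(10),
  variable = rep(c("female", "Industry"), each = 5),
  value    = c("Female", "Female", "Male", "Male", "Female",
               "construc", "construc", "trade", "trade", "trade")
)

# slot = order of first appearance of each value within its variable (1, 2, ...)
test1b <- test1a %>%
  group_by(variable) %>%
  mutate(slot = match(value, unique(value))) %>%
  ungroup() %>%
  mutate(slot = factor(slot))

# Every facet reuses the same default colors for slots 1, 2, ...
p <- ggplot(test1b, aes(educ, money, colour = slot)) +
  geom_point(alpha = 0.3) +
  geom_smooth(method = "lm", se = FALSE) +
  facet_wrap(~ variable) +
  theme(legend.position = "none")
```

The trade-off is that a legend built from slot would no longer name the actual values, so this only makes sense when the legend is hidden or the panels are labelled some other way.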
