I was wondering if there is a way to subtract two binned scatter plots from one another in R. I have two distributions with the same axes and want to overlay one on top of the other and subtract them, producing a difference scatter plot.
Here are my two plots:
and my script for the plots:
df <- read.table("data1.txt")
d <- plot(bin, main="" , colramp=my_colors, legend=F)
Alright, as a starting point, here is some sample data. Each is random, with one shifted to (2,2).
df1 <- data.frame(x = rnorm(1000), y = rnorm(1000))
df2 <- data.frame(x = rnorm(1000, 2), y = rnorm(1000, 2))
To ensure that the bins are identical, it is best to construct a single hexbin object. To accomplish this, I am using bind_rows from dplyr to combine the two data.frames while keeping track of which data.frame each row came from (this would be even easier if you had a single data.frame with a grouping variable).
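library(dplyr)   # bind_rows(), mutate(), left_join(), %>%
library(hexbin)  # hexbin(), hexTapply(), hcell2xy()

bothDF <- bind_rows(A = df1, B = df2, .id = "df")
bothHex <- hexbin(x = bothDF$x, y = bothDF$y, IDs = TRUE)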
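If you already have two separate hexbin objects, as in the question, an alternative is to force both onto the same lattice rather than binning the combined data. The following is a minimal sketch of that idea, not the approach used in the rest of this answer. It assumes that passing identical xbins, xbnds and ybnds to both hexbin calls yields matching cell ids; the names xb, yb, hexA, hexB and cnt are made up for illustration, and the seed is arbitrary.

```r
library(hexbin)

set.seed(1)  # arbitrary seed so the sketch is reproducible
df1 <- data.frame(x = rnorm(1000),    y = rnorm(1000))
df2 <- data.frame(x = rnorm(1000, 2), y = rnorm(1000, 2))

# Shared bounds that cover both data sets
xb <- range(df1$x, df2$x)
yb <- range(df1$y, df2$y)

# Same xbins, xbnds and ybnds give the same hexagon lattice,
# so a given cell id refers to the same hexagon in both objects
hexA <- hexbin(df1$x, df1$y, xbins = 30, xbnds = xb, ybnds = yb)
hexB <- hexbin(df2$x, df2$y, xbins = 30, xbnds = xb, ybnds = yb)

# Outer join on cell id; cells that are empty on one side get a count of 0
cnt <- merge(
  data.frame(id = hexA@cell, A = hexA@count),
  data.frame(id = hexB@cell, B = hexB@count),
  by = "id", all = TRUE
)
cnt[is.na(cnt)] <- 0
cnt$diff <- cnt$A - cnt$B
```

This only produces the per-cell differences; the cell centres would still have to be attached (for example with hcell2xy on each object) before plotting.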
Next, we use a mix of hexbin and dplyr to count the occurrences of each group within each cell. First, hexTapply applies table across the bins (the grouping column is wrapped in factor to make sure all levels are shown; this is not needed if your column is already a factor). The result is then simplified and turned into a data.frame, which is manipulated with mutate to calculate the difference in counts and then joined back to a table that gives the x and y values for each of the ids.
counts <- hexTapply(bothHex, factor(bothDF$df), table) %>%
  simplify2array %>%
  t %>%
  data.frame() %>%
  mutate(id = as.numeric(row.names(.)), diff = A - B) %>%
  left_join(data.frame(id = bothHex@cell, hcell2xy(bothHex)))
  A B  id diff          x         y
1 1 0   7    1 -1.3794467 -3.687014
2 1 0  71    1 -0.8149939 -3.178209
3 1 0  79    1  1.4428172 -3.178209
4 1 0  99    1 -1.5205599 -2.923806
5 2 0 105    2  0.1727985 -2.923806
6 1 0 107    1  0.7372513 -2.923806
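The counting step can also be written without the pipe, which may make it easier to see what is happening. This is a rough base R sketch, assuming that a hexbin object created with IDs = TRUE stores the cell id of every input point in its cID slot. The names pts, hex, tab and counts2 are invented here, and the data is regenerated so the block runs on its own.

```r
library(hexbin)

set.seed(1)  # arbitrary seed so the sketch is reproducible
pts <- data.frame(
  x  = c(rnorm(1000), rnorm(1000, 2)),
  y  = c(rnorm(1000), rnorm(1000, 2)),
  df = rep(c("A", "B"), each = 1000)
)
hex <- hexbin(pts$x, pts$y, IDs = TRUE)

# One row per occupied cell, one column per group
tab <- table(cell = hex@cID, df = pts$df)

counts2 <- data.frame(
  id = as.numeric(rownames(tab)),
  A  = as.vector(tab[, "A"]),
  B  = as.vector(tab[, "B"])
)
counts2$diff <- counts2$A - counts2$B

# Attach the hexagon centres for each cell id
counts2 <- merge(counts2, data.frame(id = hex@cell, hcell2xy(hex)), by = "id")
```

The resulting data.frame has the same columns as the piped version (A, B, id, diff, x, y), though in a different column order.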
Finally, we use ggplot2 to plot the resulting data, as it offers more control (and makes it easier to use a variable other than count as the fill) than hexbin's own plotting:
counts %>%
  ggplot(aes(x = x, y = y, fill = diff)) +
  geom_hex(stat = "identity") +
  coord_equal() +
  scale_fill_gradient2()
From there, it is easy to play around with axes, colors, etc.
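As one possible starting point for that kind of tweaking, here is a sketch that sets explicit colors, a legend title and axis labels. The color names, labels and theme are arbitrary choices for illustration, not part of the method, and the data is rebuilt with made-up names (pts, hex, cells) so the block runs on its own.

```r
library(hexbin)
library(ggplot2)

set.seed(1)  # arbitrary seed so the sketch is reproducible
pts <- data.frame(
  x  = c(rnorm(1000), rnorm(1000, 2)),
  y  = c(rnorm(1000), rnorm(1000, 2)),
  df = rep(c("A", "B"), each = 1000)
)
hex <- hexbin(pts$x, pts$y, IDs = TRUE)

# Per-cell difference in counts (A minus B) plus hexagon centres
tab <- table(hex@cID, pts$df)
cells <- data.frame(
  id   = as.numeric(rownames(tab)),
  diff = as.vector(tab[, "A"] - tab[, "B"])
)
cells <- merge(cells, data.frame(id = hex@cell, hcell2xy(hex)), by = "id")

p <- ggplot(cells, aes(x = x, y = y, fill = diff)) +
  geom_hex(stat = "identity") +
  coord_equal() +
  # Diverging scale centred on zero: one color for A > B, another for B > A
  scale_fill_gradient2(low = "steelblue", mid = "white", high = "firebrick",
                       midpoint = 0, name = "A - B") +
  labs(x = "x", y = "y", title = "Difference in hexbin counts") +
  theme_minimal()

print(p)
```

Centring the scale at zero keeps cells where the two data sets have equal counts visually neutral.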