I have a data set in which a coordinate can be repeated several times.
I want to make a hexbinplot displaying the maximum number of times a coordinate is repeated within that bin. I am using R and I would prefer to make it with ggplot so the graph is consistent with other graphs in the same report.
Minimum working example (the bins display the count not the max):
# No bin should be over 9
p1 <- ggplot(dat,aes(x=x,y=y))+stat_binhex(bins=10)
I think, after playing with the data a bit more, I now understand. Each bin in the plot represents multiple points, e.g., (9,9);(9,10)(10,9);(10,10) are all in a single bin in the plot. I must caution that this is the expected behavior. It is unclear to me why you do not want to do it this way. Instead, you seem to want to display the values of just one of those points (e.g. 9,9).
I don't think you will be able to do this directly in a call to
stat_hexbin, as those functions are trying to faithfully represent all of the data. In fact, they are not necessarily expecting discrete coordinates like you have at all -- they work equally well on continuous data.
For your purpose, if you want finer control, you may want to instead use
geom_tile and count the values yourself, eg. (using
countedData <- dat %$% table(x,y) %>% as.data.frame() ggplot(countedData , aes(x = x , y = y , fill = Freq)) + geom_tile()
and you might play with the representation a bit from there, but it would at least display each of the separate coordinates more faithfully.
Alternatively, you could filter your raw data to only include the points that are the maximum within a bin. That would require you to match the binning, but could at least be an option.