Daniel Bonetti - 8 months ago
R Question

R ggplot2 strange behaviour. It looks like it's passing by reference

I'm trying to copy a ggplot object and then change some properties of the copy, for instance setting the line colour to red.

Assume this code:

df = data.frame(cbind(x=1:10, y=1:10))
a = ggplot(df, aes(x=x, y=y)) + geom_line()
b = a

Then, if I change the line colour of variable a

a$layers[[1]]$geom_params$colour = "red"

it also changes the colour of b

> b$layers[[1]]$geom_params$colour
[1] "red" # why is it not "black"?

I would like to have two different objects, a and b, with different characteristics. To do this the correct way, I would need to call the plot again for b:

b = ggplot(df, aes(x=x, y=y)) + geom_line()

However, at this point in the algorithm, there is no way to know the plot command ggplot(df, aes(x=x, y=y)) + geom_line().

Do you know what's wrong with this? Are ggplot objects treated in a different manner?



The issue here is that ggplot uses the proto library to mimic OO-style objects. The proto library relies on environments to hold each object's variables. Environments are passed by reference, which is why you are seeing this behavior (and also a reason why probably no one would recommend changing the properties of a layer that way).
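
To see the reference semantics in isolation, here is a minimal base-R sketch. It does not involve ggplot2 or proto at all, and the variable names are made up for illustration. It only contrasts how a plain list and an environment behave after assignment:

```r
# A plain list follows R's usual copy-on-modify rules.
l1 <- list(colour = "black")
l2 <- l1
l1$colour <- "red"
l2$colour            # "black": l2 is unaffected

# An environment is a reference object: assignment does not copy it.
e1 <- new.env()
e1$colour <- "black"
e2 <- e1             # e2 is just another name for the same environment
e1$colour <- "red"
e2$colour            # "red": the change is visible through e2
identical(e1, e2)    # TRUE
```

The second half is what happens to the layers inside a and b: both plots hold the same underlying environment.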

Anyway, adapting an example from the proto documentation, we can try to make a deep copy of the layers of the ggplot object. This should "disconnect" them. Here's such a helper function:

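duplicate.ggplot <- function(x) {
    require(proto)  # provides as.proto()
    r <- x          # copies the plot list; its layers are still shared at this point
    r$layers <- lapply(r$layers, function(layer) {
        as.proto(as.list(layer), parent=layer)  # rebuild each layer as a new proto object
    })
    r
}

So if we run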

df = data.frame(cbind(x=1:10, y=1:10))
a = ggplot(df, aes(x=x, y=y)) + geom_line()
b = a
c = duplicate.ggplot(a)

a$layers[[1]]$geom_params$colour = "red"

then plot all three, we get

[image: plots of a, b, and c]

which shows we can change "c" independently from "a".
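
The copy-by-rebuilding idea can also be tried out with plain environments, without ggplot2 or proto installed. This is only a sketch of the general technique, not what as.proto does internally; copy_env is a hypothetical helper written for this example:

```r
# Hypothetical helper: build a new environment holding the same variables.
copy_env <- function(e) {
  list2env(as.list(e, all.names = TRUE), parent = parent.env(e))
}

orig <- new.env()
orig$params <- list(colour = "black")

alias <- orig            # same environment, like b = a
clone <- copy_env(orig)  # new environment, like c = duplicate.ggplot(a)

orig$params$colour <- "red"

alias$params$colour      # "red"
clone$params$colour      # "black"
```

Note that the copy is only one level deep: any environment stored inside the original would still be shared with the clone.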