digEmAll - 1 month ago
R Question

Unexpected behavior of `class<-`()

I'm having trouble understanding the difference between the following two snippets:

First snippet:

a <- 123
class(a) <- 'FOO'

b <- a
class(b) <- 'BAR'

class(a) # returns 'FOO'
class(b) # returns 'BAR'


Second snippet:

a <- 123
`class<-`(a,'FOO')

b <- a
`class<-`(b,'BAR')

class(a) # returns 'BAR' ! so class attribute has been replaced also on "a"
class(b) # returns 'BAR'


As far as I know, the assignment b <- a should create a copy of the object a, not immediately, but as soon as b is modified.

But in the second case, it seems that when the function `class<-`(x, "") is called directly (I expected the replacement form class(x) <- "" to be just syntactic sugar for it), the copy is not created and the original object is modified instead.

Am I missing something (maybe in the documentation)?

Tested in R version 3.2.5

Answer

This is an R bug present in older versions of R. A recent version of R-devel already works as expected - the object is not modified in place. It was fixed in R-devel version 70636.

Please note that the example is not quite correct. Because of R's value semantics, functions are not allowed to modify their arguments in place (environments are the exception). Therefore, functions like `class<-` have to modify a copy of their argument and return that copy, so the example should have used a <- `class<-`(a, 'FOO') instead of just `class<-`(a, 'FOO').
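A minimal sketch of the corrected example, assuming an R version that includes the fix mentioned above. The variable names follow the question. The result of each direct call is assigned back, so nothing relies on in-place modification:

```r
a <- 123
a <- `class<-`(a, 'FOO')   # keep the returned copy

b <- a
b <- `class<-`(b, 'BAR')   # only b is rebound to the new object

class(a)                   # expected: 'FOO'
class(b)                   # expected: 'BAR'
```

This is shown only to make the point about return values; in real code the class(x) <- form is still preferable.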

When a replacement function (like `class<-`) is implemented in R, there is no way of violating the value semantics. But in an assignment like a <- `class<-`(a, 'FOO'), we can see that the old value of a, without the attribute set, will not be used again. For efficiency, many replacement functions are implemented in C, and they violate R's value semantics by modifying their argument in place, but only when it is known that the value of that argument is not used by any other variable. Incidentally, this optimization had been (re-)added to `class<-` in the respective version of R.
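To illustrate the first point, here is a small sketch of a replacement function written in plain R. The name `label<-` and the "label" attribute are hypothetical, chosen only for this example. Since the body is ordinary R code, it can only change its own local copy of x:

```r
# Hypothetical replacement function: sets a "label" attribute.
# The last argument of a replacement function must be named `value`.
`label<-` <- function(x, value) {
  attr(x, "label") <- value   # modifies the function's local x only
  x                           # return the modified copy
}

a <- 123
b <- a
label(b) <- "BAR"             # sugar for: b <- `label<-`(b, value = "BAR")

attr(a, "label")              # NULL: a is untouched
attr(b, "label")              # "BAR"
```

Whether a physical copy is made at any given step is left to the interpreter; the visible behaviour is always that of a copy.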

I believe that the function `class<-` (and similar ones) should not be called directly by R programs. In this simple case the mapping is as straightforward as the question assumes, but in general, mapping replacement calls to calls of functions like `class<-` is more complicated. Also, the implementation currently differs between the AST interpreter and the byte-code interpreter. More information can currently be found in the R Language Definition (for the AST interpreter) and in the compiler documentation, but those are implementation details that R programs should not rely on. R programs should always use the class(x) <- form.
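As a rough illustration of why the mapping is more complicated in general, consider a nested replacement call. The manual expansion below follows the scheme described in the R Language Definition for the AST interpreter; it is a sketch for comparison only, not something to write in real code, and the byte-code interpreter may proceed differently:

```r
x <- c(a = 1, b = 2, c = 3)
y <- x

# Replacement form, as R programs should write it.
names(x)[2] <- "B"

# Approximate manual expansion of the same statement, applied to y.
`*tmp*` <- y
y <- `names<-`(`*tmp*`, value = `[<-`(names(`*tmp*`), 2, value = "B"))
rm(`*tmp*`)

identical(x, y)   # both routes give the same named vector
```

Even this two-level case needs a temporary variable and two replacement functions, which is one reason to leave the translation to R.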